SlideShare ist ein Scribd-Unternehmen logo
1 von 97
Downloaden Sie, um offline zu lesen
Chubby Lock Service
Amir H. Payberah
amir@sics.se
Amirkabir University of Technology
(Tehran Polytechnic)
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 1 / 38
What is the Problem?
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 2 / 38
What is the Problem?
Large distributed system
How to ensure that only one process across a fleet of processes acts
on a resource?
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 3 / 38
What is the Problem?
Large distributed system
How to ensure that only one process across a fleet of processes acts
on a resource?
• Ensure only one server can write to a database.
• Ensure that only one server can perform a particular action.
• Ensure that there is a single master that processes all writes.
• ...
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 3 / 38
What is the Problem?
All these scenarios need a simple way to coordinate execution of
process to ensure that only one entity is performing an action.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 4 / 38
What is the Problem?
All these scenarios need a simple way to coordinate execution of
process to ensure that only one entity is performing an action.
We need agreement (consensus)
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 4 / 38
What is the Problem?
All these scenarios need a simple way to coordinate execution of
process to ensure that only one entity is performing an action.
We need agreement (consensus)
Paxos ...
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 4 / 38
How Can We Use Paxos
to Solve the Problem
of Coordination?
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 5 / 38
Possible Solutions
Building a consensus library
Building a centralized lock service
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 6 / 38
Consensus Library vs. Centralized Service
Consensus library
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 7 / 38
Consensus Library vs. Centralized Service
Consensus library
• Library code has to be included in every program.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 7 / 38
Consensus Library vs. Centralized Service
Consensus library
• Library code has to be included in every program.
• Application developers to be experts in distributed system.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 7 / 38
Consensus Library vs. Centralized Service
Consensus library
• Library code has to be included in every program.
• Application developers to be experts in distributed system.
• Need to wait on majority (quorums).
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 7 / 38
Consensus Library vs. Centralized Service
Consensus library
• Library code has to be included in every program.
• Application developers to be experts in distributed system.
• Need to wait on majority (quorums).
Centralized lock service
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 7 / 38
Consensus Library vs. Centralized Service
Consensus library
• Library code has to be included in every program.
• Application developers to be experts in distributed system.
• Need to wait on majority (quorums).
Centralized lock service
• Clean interface.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 7 / 38
Consensus Library vs. Centralized Service
Consensus library
• Library code has to be included in every program.
• Application developers to be experts in distributed system.
• Need to wait on majority (quorums).
Centralized lock service
• Clean interface.
• Service can be modified independently.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 7 / 38
Consensus Library vs. Centralized Service
Consensus library
• Library code has to be included in every program.
• Application developers to be experts in distributed system.
• Need to wait on majority (quorums).
Centralized lock service
• Clean interface.
• Service can be modified independently.
• A single client can obtain a lock and make progress (non-quorum
based decisions, the lock service takes care of it)
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 7 / 38
Consensus Library vs. Centralized Service
Consensus library
• Library code has to be included in every program.
• Application developers to be experts in distributed system.
• Need to wait on majority (quorums).
Centralized lock service
• Clean interface.
• Service can be modified independently.
• A single client can obtain a lock and make progress (non-quorum
based decisions, the lock service takes care of it)
• Application developers do not have to worry about dealing with
operations, various failure modes debugging etc.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 7 / 38
Centralized Lock Service
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 8 / 38
Chubby is a Lock Service
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 9 / 38
Chubby
Primary goals: reliability and availability (not high performance)
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 10 / 38
Chubby
Primary goals: reliability and availability (not high performance)
Used in Google: GFS, Bigtable, etc.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 10 / 38
Chubby Structure
Two main components:
• Server (chubby cell)
• Client library
Communicate via RPC
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 11 / 38
Chubby Cell
A small set of servers (replicas)
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 12 / 38
Chubby Cell
A small set of servers (replicas)
One is the master (elected from the replicas via Paxos)
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 12 / 38
Chubby Cell
A small set of servers (replicas)
One is the master (elected from the replicas via Paxos)
Maintain copies of simple database (replicated state machines)
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 12 / 38
Chubby Cell
A small set of servers (replicas)
One is the master (elected from the replicas via Paxos)
Maintain copies of simple database (replicated state machines)
• Only the master initiates reads and writes of this database.
• All other replicas simply copy updates from the master (using the
Paxos protocol).
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 12 / 38
Chubby Client
Find the master: sending master location requests to replicas.
All requests are sent directly to the master.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 13 / 38
Chubby Operations
Write requests
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 14 / 38
Chubby Operations
Write requests
• The master receives the request.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 14 / 38
Chubby Operations
Write requests
• The master receives the request.
• Write requests are propagated via the consensus protocol to all
replicas.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 14 / 38
Chubby Operations
Write requests
• The master receives the request.
• Write requests are propagated via the consensus protocol to all
replicas.
• Acknowledges when write request reaches the majority of the
replicas
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 14 / 38
Chubby Operations
Write requests
• The master receives the request.
• Write requests are propagated via the consensus protocol to all
replicas.
• Acknowledges when write request reaches the majority of the
replicas
Read requests
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 14 / 38
Chubby Operations
Write requests
• The master receives the request.
• Write requests are propagated via the consensus protocol to all
replicas.
• Acknowledges when write request reaches the majority of the
replicas
Read requests
• Satisfied by the master alone.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 14 / 38
Let’s Go into More Details
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 15 / 38
Chubby Interface (1/2)
Chubby exports a unix-like filesystem interface.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 16 / 38
Chubby Interface (1/2)
Chubby exports a unix-like filesystem interface.
Tree of files with names separated by slashes (namesapace):
/ls/cell1/aut/cc14:
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 16 / 38
Chubby Interface (1/2)
Chubby exports a unix-like filesystem interface.
Tree of files with names separated by slashes (namesapace):
/ls/cell1/aut/cc14:
• 1st component (ls): lock service (common to all names)
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 16 / 38
Chubby Interface (1/2)
Chubby exports a unix-like filesystem interface.
Tree of files with names separated by slashes (namesapace):
/ls/cell1/aut/cc14:
• 1st component (ls): lock service (common to all names)
• 2nd component (cell1): the chubby cell name
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 16 / 38
Chubby Interface (1/2)
Chubby exports a unix-like filesystem interface.
Tree of files with names separated by slashes (namesapace):
/ls/cell1/aut/cc14:
• 1st component (ls): lock service (common to all names)
• 2nd component (cell1): the chubby cell name
• The rest: the name of the directory and the file (name inside the
cell)
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 16 / 38
Chubby Interface (2/2)
Chubby maintains a namespace that contains files and directories,
called nodes.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 17 / 38
Chubby Interface (2/2)
Chubby maintains a namespace that contains files and directories,
called nodes.
Clients can create nodes and add contents to nodes.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 17 / 38
Chubby Interface (2/2)
Chubby maintains a namespace that contains files and directories,
called nodes.
Clients can create nodes and add contents to nodes.
Support most normal operations:
• create, delete, open, write, ...
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 17 / 38
Chubby Interface (2/2)
Chubby maintains a namespace that contains files and directories,
called nodes.
Clients can create nodes and add contents to nodes.
Support most normal operations:
• create, delete, open, write, ...
Clients open nodes to obtain handles that are analogous to unix file
descriptors.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 17 / 38
Locks (1/2)
Each Chubby file and directory (node) can act as a reader/writer
lock.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 18 / 38
Locks (1/2)
Each Chubby file and directory (node) can act as a reader/writer
lock.
A client handle may hold the lock in two modes: writer (exclusive)
mode or reader (shared) mode.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 18 / 38
Locks (1/2)
Each Chubby file and directory (node) can act as a reader/writer
lock.
A client handle may hold the lock in two modes: writer (exclusive)
mode or reader (shared) mode.
• Only one client can hold lock in writer mode.
• Many clients can hold lock in reader mode.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 18 / 38
Locks (2/2)
Locks are advisory in Chubby.
• Holding a lock is not necessary to access file.
• Holding a lock does not prevent other clients accessing file.
• Conflicts only with other attempts to acquire the same lock.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 19 / 38
Chubby Example (1/2)
Electing a primary (leader) process.
Steps in primary election:
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 20 / 38
Chubby Example (1/2)
Electing a primary (leader) process.
Steps in primary election:
• Clients open a lock file and attempt to acquire the lock in write
mode.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 20 / 38
Chubby Example (1/2)
Electing a primary (leader) process.
Steps in primary election:
• Clients open a lock file and attempt to acquire the lock in write
mode.
• One client succeeds (becomes the primary) and writes its
name/identity into the file.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 20 / 38
Chubby Example (1/2)
Electing a primary (leader) process.
Steps in primary election:
• Clients open a lock file and attempt to acquire the lock in write
mode.
• One client succeeds (becomes the primary) and writes its
name/identity into the file.
• Other clients fail (become replicas) and discover the name of the
primary by reading the file.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 20 / 38
Chubby Example (1/2)
Electing a primary (leader) process.
Steps in primary election:
• Clients open a lock file and attempt to acquire the lock in write
mode.
• One client succeeds (becomes the primary) and writes its
name/identity into the file.
• Other clients fail (become replicas) and discover the name of the
primary by reading the file.
• Primary obtains sequencer and passes it to servers (servers can
insure that the elected primary is still valid).
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 20 / 38
Chubby Example (2/2)
Leader election
Open("/ls/foo/OurServicePrimary", "write mode")
if (successful) { // primary
SetContents(primary_identity)
} else { // replica
Open("/ls/foo/OurServicePrimary", "read mode",
"file-modification event")
when notified of file modification:
primary = GetContentsAndStat()
}
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 21 / 38
APIs
Open(), Close(), Delete()
GetContentsAndStat(), GetStat(), ReadDir()
SetContents()
SetACL()
Locks: Acquire(), TryAcquire(), Release()
Sequencers: GetSequencer(), SetSequencer(),
CheckSequencer()
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 22 / 38
Events
Chubby clients may subscribe to many events when they create a
handle.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 23 / 38
Events
Chubby clients may subscribe to many events when they create a
handle.
Events include:
• File contents modified
• Child node added, removed, or modified
• Chubby master failed over
• Handle has become invalid
• Lock acquired
• Conflicting lock requested from another client
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 23 / 38
Locking - Problem
After acquiring a lock and issuing a request (R1) a client may fail.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 24 / 38
Locking - Problem
After acquiring a lock and issuing a request (R1) a client may fail.
Another client may acquire the lock and issue its own request (R2).
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 24 / 38
Locking - Problem
After acquiring a lock and issuing a request (R1) a client may fail.
Another client may acquire the lock and issue its own request (R2).
R1 arrive later at the server and be acted upon (possible inconsis-
tency).
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 24 / 38
Locking - Solutions (1/3)
Solution 1: virtual time
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 25 / 38
Locking - Solutions (1/3)
Solution 1: virtual time
• But, it is costly to introduce sequence numbers into all the
interactions.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 25 / 38
Locking - Solutions (2/3)
Solution 2: sequencers
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 26 / 38
Locking - Solutions (2/3)
Solution 2: sequencers
• Sequence numbers are introduced into only those interactions that
make use of locks.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 26 / 38
Locking - Solutions (2/3)
Solution 2: sequencers
• Sequence numbers are introduced into only those interactions that
make use of locks.
• A lock holder can obtain a sequencer from Chubby.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 26 / 38
Locking - Solutions (2/3)
Solution 2: sequencers
• Sequence numbers are introduced into only those interactions that
make use of locks.
• A lock holder can obtain a sequencer from Chubby.
• It attaches the sequencer to any requests that it sends to other
servers (e.g., Bigtable)
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 26 / 38
Locking - Solutions (2/3)
Solution 2: sequencers
• Sequence numbers are introduced into only those interactions that
make use of locks.
• A lock holder can obtain a sequencer from Chubby.
• It attaches the sequencer to any requests that it sends to other
servers (e.g., Bigtable)
• The other servers can verify the sequencer information
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 26 / 38
Locking - Solutions (3/3)
Solution 3: lock-delay
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 27 / 38
Locking - Solutions (3/3)
Solution 3: lock-delay
• Correctly released locks are immediately available.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 27 / 38
Locking - Solutions (3/3)
Solution 3: lock-delay
• Correctly released locks are immediately available.
• If a lock becomes free because holder failed or becomes inaccessible,
lock cannot be claimed by another client until lock-delay expires.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 27 / 38
Caching
To reduce read traffic, Chubby clients cache file data and node
meta-data.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 28 / 38
Caching
To reduce read traffic, Chubby clients cache file data and node
meta-data.
Write-through: write is done synchronously both to the cache and
to the primary copy.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 28 / 38
Caching
To reduce read traffic, Chubby clients cache file data and node
meta-data.
Write-through: write is done synchronously both to the cache and
to the primary copy.
Master will invalidate cached copies upon a write request.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 28 / 38
Caching
To reduce read traffic, Chubby clients cache file data and node
meta-data.
Write-through: write is done synchronously both to the cache and
to the primary copy.
Master will invalidate cached copies upon a write request.
The client also can allow its cache lease to expire.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 28 / 38
Session Lease (1/3)
Lease: a promise by one party to abide by an agreement for a given
interval of time unless the promise is explicitly revoked.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 29 / 38
Session Lease (1/3)
Lease: a promise by one party to abide by an agreement for a given
interval of time unless the promise is explicitly revoked.
Session: a relationship between a Chubby cell and a Chubby client.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 29 / 38
Session Lease (1/3)
Lease: a promise by one party to abide by an agreement for a given
interval of time unless the promise is explicitly revoked.
Session: a relationship between a Chubby cell and a Chubby client.
Session lease: interval during which client’s cached items (handles,
locks, files) are valid.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 29 / 38
Session Lease (1/3)
Lease: a promise by one party to abide by an agreement for a given
interval of time unless the promise is explicitly revoked.
Session: a relationship between a Chubby cell and a Chubby client.
Session lease: interval during which client’s cached items (handles,
locks, files) are valid.
Session maintained through KeepAlives messages.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 29 / 38
Session Lease (2/3)
If client’s local lease expires (happens when a master fails over)
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 30 / 38
Session Lease (2/3)
If client’s local lease expires (happens when a master fails over)
• Client disables cache.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 30 / 38
Session Lease (2/3)
If client’s local lease expires (happens when a master fails over)
• Client disables cache.
• Session in jeopardy, client waits in grace period (45s).
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 30 / 38
Session Lease (2/3)
If client’s local lease expires (happens when a master fails over)
• Client disables cache.
• Session in jeopardy, client waits in grace period (45s).
• Sends jeopardy event to application.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 30 / 38
Session Lease (2/3)
If client’s local lease expires (happens when a master fails over)
• Client disables cache.
• Session in jeopardy, client waits in grace period (45s).
• Sends jeopardy event to application.
• Application can suspend activity.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 30 / 38
Session Lease (3/3)
Client attempts to exchange KeepAlive messages with master during
grace period.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 31 / 38
Session Lease (3/3)
Client attempts to exchange KeepAlive messages with master during
grace period.
• Succeed: re-enables cache; send safe event to application
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 31 / 38
Session Lease (3/3)
Client attempts to exchange KeepAlive messages with master during
grace period.
• Succeed: re-enables cache; send safe event to application
• Failed: discards cache; sends expired event to application
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 31 / 38
Scaling
Reducing communication with the master.
Two techniques:
• Proxies
• Partitioning
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 32 / 38
Scaling - Proxies
Proxy is a trusted process that pass requests from other clients to
a Chubby cell.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 33 / 38
Scaling - Proxies
Proxy is a trusted process that pass requests from other clients to
a Chubby cell.
Can handle KeepAlives and reads → reduce the master load.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 33 / 38
Scaling - Proxies
Proxy is a trusted process that pass requests from other clients to
a Chubby cell.
Can handle KeepAlives and reads → reduce the master load.
Cannot reduce write loads, but they are << 1% of workload.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 33 / 38
Scaling - Proxies
Proxy is a trusted process that pass requests from other clients to
a Chubby cell.
Can handle KeepAlives and reads → reduce the master load.
Cannot reduce write loads, but they are << 1% of workload.
Introduces another point of failure.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 33 / 38
Scaling - Partitioning
Namespace partitioned between servers.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 34 / 38
Scaling - Partitioning
Namespace partitioned between servers.
N partitions, each with master and replicas
Node D/C stored on P(D/C) = hash(D) mod N
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 34 / 38
Summary
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 35 / 38
Summary
Library vs. Service
Chubby: coarse-grained lock service
Chubby cell and clients
Unix-like interface
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 36 / 38
References:
M. Burrows, The Chubby lock service for loosely-coupled distributed
systems, OSDI, 2006.
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 37 / 38
Questions?
Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 38 / 38

Weitere ähnliche Inhalte

Was ist angesagt?

Multithreading models.ppt
Multithreading models.pptMultithreading models.ppt
Multithreading models.ppt
Luis Goldster
 
Fault Tolerance System
Fault Tolerance SystemFault Tolerance System
Fault Tolerance System
Ehsan Ilahi
 
Classical problem of synchronization
Classical problem of synchronizationClassical problem of synchronization
Classical problem of synchronization
Shakshi Ranawat
 
17. Recovery System in DBMS
17. Recovery System in DBMS17. Recovery System in DBMS
17. Recovery System in DBMS
koolkampus
 

Was ist angesagt? (20)

Computer architecture page replacement algorithms
Computer architecture page replacement algorithmsComputer architecture page replacement algorithms
Computer architecture page replacement algorithms
 
The Google Chubby lock service for loosely-coupled distributed systems
The Google Chubby lock service for loosely-coupled distributed systemsThe Google Chubby lock service for loosely-coupled distributed systems
The Google Chubby lock service for loosely-coupled distributed systems
 
Types of grammer - TOC
Types of grammer - TOCTypes of grammer - TOC
Types of grammer - TOC
 
Concurrency Control Techniques
Concurrency Control TechniquesConcurrency Control Techniques
Concurrency Control Techniques
 
Introduction to Parallel and Distributed Computing
Introduction to Parallel and Distributed ComputingIntroduction to Parallel and Distributed Computing
Introduction to Parallel and Distributed Computing
 
VIRTUAL MEMORY
VIRTUAL MEMORYVIRTUAL MEMORY
VIRTUAL MEMORY
 
Multiprocessor
MultiprocessorMultiprocessor
Multiprocessor
 
Multiprocessor Architecture (Advanced computer architecture)
Multiprocessor Architecture  (Advanced computer architecture)Multiprocessor Architecture  (Advanced computer architecture)
Multiprocessor Architecture (Advanced computer architecture)
 
Concurrency control!
Concurrency control!Concurrency control!
Concurrency control!
 
Dichotomy of parallel computing platforms
Dichotomy of parallel computing platformsDichotomy of parallel computing platforms
Dichotomy of parallel computing platforms
 
Multithreading models.ppt
Multithreading models.pptMultithreading models.ppt
Multithreading models.ppt
 
Fault Tolerance System
Fault Tolerance SystemFault Tolerance System
Fault Tolerance System
 
Contiguous Memory Allocation-R.D.Sivakumar
Contiguous Memory Allocation-R.D.SivakumarContiguous Memory Allocation-R.D.Sivakumar
Contiguous Memory Allocation-R.D.Sivakumar
 
Pram model
Pram modelPram model
Pram model
 
Classical problem of synchronization
Classical problem of synchronizationClassical problem of synchronization
Classical problem of synchronization
 
Deadlock Detection
Deadlock DetectionDeadlock Detection
Deadlock Detection
 
Communication model of parallel platforms
Communication model of parallel platformsCommunication model of parallel platforms
Communication model of parallel platforms
 
17. Recovery System in DBMS
17. Recovery System in DBMS17. Recovery System in DBMS
17. Recovery System in DBMS
 
The CAP Theorem
The CAP Theorem The CAP Theorem
The CAP Theorem
 
Swapping | Computer Science
Swapping | Computer ScienceSwapping | Computer Science
Swapping | Computer Science
 

Andere mochten auch

Ow2 Jonas Use Case Ministere Interieur Open World Forum
Ow2 Jonas Use Case Ministere Interieur Open World ForumOw2 Jonas Use Case Ministere Interieur Open World Forum
Ow2 Jonas Use Case Ministere Interieur Open World Forum
OW2
 
Portland Views
Portland ViewsPortland Views
Portland Views
gardenmam
 
Libby's Tips for Getting Unstuck
Libby's Tips for Getting UnstuckLibby's Tips for Getting Unstuck
Libby's Tips for Getting Unstuck
Libby Gill
 
Chapter 2 power point
Chapter 2 power pointChapter 2 power point
Chapter 2 power point
dphil002
 
Inquiry presentation
Inquiry presentationInquiry presentation
Inquiry presentation
wall530
 
What Do You Feel 2008
What Do You Feel 2008What Do You Feel 2008
What Do You Feel 2008
renee22220
 
Conpaas Elastic Cloud, OW2con 2011, Nov 24-25, Paris
Conpaas Elastic Cloud, OW2con 2011, Nov 24-25, ParisConpaas Elastic Cloud, OW2con 2011, Nov 24-25, Paris
Conpaas Elastic Cloud, OW2con 2011, Nov 24-25, Paris
OW2
 
Ow2 X Wiki Use Case Open World Forum09
Ow2 X Wiki Use Case Open World Forum09Ow2 X Wiki Use Case Open World Forum09
Ow2 X Wiki Use Case Open World Forum09
OW2
 
Life Beautiful Monday
Life Beautiful MondayLife Beautiful Monday
Life Beautiful Monday
Pentiux
 
Paisajes De Serge Motylev
Paisajes De Serge MotylevPaisajes De Serge Motylev
Paisajes De Serge Motylev
alfcoltrane
 
UShareSoft Software onboarding to cloud, OW2con11, Nov 24-25, Paris
UShareSoft Software onboarding to cloud, OW2con11, Nov 24-25, ParisUShareSoft Software onboarding to cloud, OW2con11, Nov 24-25, Paris
UShareSoft Software onboarding to cloud, OW2con11, Nov 24-25, Paris
OW2
 
Migration Novaforge OW2 Conference Nov10
Migration Novaforge OW2 Conference Nov10Migration Novaforge OW2 Conference Nov10
Migration Novaforge OW2 Conference Nov10
OW2
 
CompatibleOne Project, OW2con 2011, Nov 24-25, Paris
CompatibleOne Project, OW2con 2011, Nov 24-25, ParisCompatibleOne Project, OW2con 2011, Nov 24-25, Paris
CompatibleOne Project, OW2con 2011, Nov 24-25, Paris
OW2
 
XWiki OW2 Conference Nov10
XWiki OW2 Conference Nov10XWiki OW2 Conference Nov10
XWiki OW2 Conference Nov10
OW2
 
La Casa Invisible
La Casa InvisibleLa Casa Invisible
La Casa Invisible
Crisis 999
 

Andere mochten auch (20)

Resume Infographic
Resume InfographicResume Infographic
Resume Infographic
 
AcceDe Web, a Guide for Accessibility Web Projects, OW2con'16, Paris.
AcceDe Web, a Guide for Accessibility Web Projects, OW2con'16, Paris.  AcceDe Web, a Guide for Accessibility Web Projects, OW2con'16, Paris.
AcceDe Web, a Guide for Accessibility Web Projects, OW2con'16, Paris.
 
Ow2 Jonas Use Case Ministere Interieur Open World Forum
Ow2 Jonas Use Case Ministere Interieur Open World ForumOw2 Jonas Use Case Ministere Interieur Open World Forum
Ow2 Jonas Use Case Ministere Interieur Open World Forum
 
Portland Views
Portland ViewsPortland Views
Portland Views
 
Libby's Tips for Getting Unstuck
Libby's Tips for Getting UnstuckLibby's Tips for Getting Unstuck
Libby's Tips for Getting Unstuck
 
Chapter 2 power point
Chapter 2 power pointChapter 2 power point
Chapter 2 power point
 
OW2con'14 - XLcloud, a demonstation of 3D remote rendering in the cloud
OW2con'14 - XLcloud, a demonstation of 3D remote rendering in the cloudOW2con'14 - XLcloud, a demonstation of 3D remote rendering in the cloud
OW2con'14 - XLcloud, a demonstation of 3D remote rendering in the cloud
 
Inquiry presentation
Inquiry presentationInquiry presentation
Inquiry presentation
 
What Do You Feel 2008
What Do You Feel 2008What Do You Feel 2008
What Do You Feel 2008
 
Star Animation I
Star Animation IStar Animation I
Star Animation I
 
Why no one loves me? Using BI techniques to make my business more attractive,...
Why no one loves me? Using BI techniques to make my business more attractive,...Why no one loves me? Using BI techniques to make my business more attractive,...
Why no one loves me? Using BI techniques to make my business more attractive,...
 
Conpaas Elastic Cloud, OW2con 2011, Nov 24-25, Paris
Conpaas Elastic Cloud, OW2con 2011, Nov 24-25, ParisConpaas Elastic Cloud, OW2con 2011, Nov 24-25, Paris
Conpaas Elastic Cloud, OW2con 2011, Nov 24-25, Paris
 
Ow2 X Wiki Use Case Open World Forum09
Ow2 X Wiki Use Case Open World Forum09Ow2 X Wiki Use Case Open World Forum09
Ow2 X Wiki Use Case Open World Forum09
 
Life Beautiful Monday
Life Beautiful MondayLife Beautiful Monday
Life Beautiful Monday
 
Paisajes De Serge Motylev
Paisajes De Serge MotylevPaisajes De Serge Motylev
Paisajes De Serge Motylev
 
UShareSoft Software onboarding to cloud, OW2con11, Nov 24-25, Paris
UShareSoft Software onboarding to cloud, OW2con11, Nov 24-25, ParisUShareSoft Software onboarding to cloud, OW2con11, Nov 24-25, Paris
UShareSoft Software onboarding to cloud, OW2con11, Nov 24-25, Paris
 
Migration Novaforge OW2 Conference Nov10
Migration Novaforge OW2 Conference Nov10Migration Novaforge OW2 Conference Nov10
Migration Novaforge OW2 Conference Nov10
 
CompatibleOne Project, OW2con 2011, Nov 24-25, Paris
CompatibleOne Project, OW2con 2011, Nov 24-25, ParisCompatibleOne Project, OW2con 2011, Nov 24-25, Paris
CompatibleOne Project, OW2con 2011, Nov 24-25, Paris
 
XWiki OW2 Conference Nov10
XWiki OW2 Conference Nov10XWiki OW2 Conference Nov10
XWiki OW2 Conference Nov10
 
La Casa Invisible
La Casa InvisibleLa Casa Invisible
La Casa Invisible
 

Ähnlich wie Chubby

Ähnlich wie Chubby (20)

Introduction to Operating Systems - Part2
Introduction to Operating Systems - Part2Introduction to Operating Systems - Part2
Introduction to Operating Systems - Part2
 
Deadlocks
DeadlocksDeadlocks
Deadlocks
 
Bigtable
BigtableBigtable
Bigtable
 
Main Memory - Part1
Main Memory - Part1Main Memory - Part1
Main Memory - Part1
 
Process Management - Part2
Process Management - Part2Process Management - Part2
Process Management - Part2
 
Introduction to Operating Systems - Part1
Introduction to Operating Systems - Part1Introduction to Operating Systems - Part1
Introduction to Operating Systems - Part1
 
MegaStore and Spanner
MegaStore and SpannerMegaStore and Spanner
MegaStore and Spanner
 
Process Management - Part1
Process Management - Part1Process Management - Part1
Process Management - Part1
 
Introduction to stream processing
Introduction to stream processingIntroduction to stream processing
Introduction to stream processing
 
Threads
ThreadsThreads
Threads
 
Introduction to cloud computing and big data - part1
Introduction to cloud computing and big data - part1Introduction to cloud computing and big data - part1
Introduction to cloud computing and big data - part1
 
Introduction to Operating Systems - Part3
Introduction to Operating Systems - Part3Introduction to Operating Systems - Part3
Introduction to Operating Systems - Part3
 
Google File System
Google File SystemGoogle File System
Google File System
 
Process Synchronization - Part2
Process Synchronization -  Part2Process Synchronization -  Part2
Process Synchronization - Part2
 
Mesos and YARN
Mesos and YARNMesos and YARN
Mesos and YARN
 
Virtual Memory - Part1
Virtual Memory - Part1Virtual Memory - Part1
Virtual Memory - Part1
 
Virtual Memory - Part2
Virtual Memory - Part2Virtual Memory - Part2
Virtual Memory - Part2
 
Paxos
PaxosPaxos
Paxos
 
Protection
ProtectionProtection
Protection
 
Process Synchronization - Part1
Process Synchronization -  Part1Process Synchronization -  Part1
Process Synchronization - Part1
 

Mehr von Amir Payberah (16)

P2P Content Distribution Network
P2P Content Distribution NetworkP2P Content Distribution Network
P2P Content Distribution Network
 
Cloud Computing
Cloud ComputingCloud Computing
Cloud Computing
 
Data Intensive Computing Frameworks
Data Intensive Computing FrameworksData Intensive Computing Frameworks
Data Intensive Computing Frameworks
 
The Stratosphere Big Data Analytics Platform
The Stratosphere Big Data Analytics PlatformThe Stratosphere Big Data Analytics Platform
The Stratosphere Big Data Analytics Platform
 
The Spark Big Data Analytics Platform
The Spark Big Data Analytics PlatformThe Spark Big Data Analytics Platform
The Spark Big Data Analytics Platform
 
Security
SecuritySecurity
Security
 
Linux Module Programming
Linux Module ProgrammingLinux Module Programming
Linux Module Programming
 
IO Systems
IO SystemsIO Systems
IO Systems
 
File System Implementation - Part2
File System Implementation - Part2File System Implementation - Part2
File System Implementation - Part2
 
File System Implementation - Part1
File System Implementation - Part1File System Implementation - Part1
File System Implementation - Part1
 
File System Interface
File System InterfaceFile System Interface
File System Interface
 
Storage
StorageStorage
Storage
 
Main Memory - Part2
Main Memory - Part2Main Memory - Part2
Main Memory - Part2
 
CPU Scheduling - Part2
CPU Scheduling - Part2CPU Scheduling - Part2
CPU Scheduling - Part2
 
CPU Scheduling - Part1
CPU Scheduling - Part1CPU Scheduling - Part1
CPU Scheduling - Part1
 
Process Management - Part3
Process Management - Part3Process Management - Part3
Process Management - Part3
 

Kürzlich hochgeladen

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Kürzlich hochgeladen (20)

Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 

Chubby

  • 1. Chubby Lock Service Amir H. Payberah amir@sics.se Amirkabir University of Technology (Tehran Polytechnic) Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 1 / 38
  • 2. What is the Problem? Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 2 / 38
  • 3. What is the Problem? Large distributed system How to ensure that only one process across a fleet of processes acts on a resource? Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 3 / 38
  • 4. What is the Problem? Large distributed system How to ensure that only one process across a fleet of processes acts on a resource? • Ensure only one server can write to a database. • Ensure that only one server can perform a particular action. • Ensure that there is a single master that processes all writes. • ... Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 3 / 38
  • 5. What is the Problem? All these scenarios need a simple way to coordinate execution of process to ensure that only one entity is performing an action. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 4 / 38
  • 6. What is the Problem? All these scenarios need a simple way to coordinate execution of process to ensure that only one entity is performing an action. We need agreement (consensus) Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 4 / 38
  • 7. What is the Problem? All these scenarios need a simple way to coordinate execution of process to ensure that only one entity is performing an action. We need agreement (consensus) Paxos ... Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 4 / 38
  • 8. How Can We Use Paxos to Solve the Problem of Coordination? Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 5 / 38
  • 9. Possible Solutions Building a consensus library Building a centralized lock service Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 6 / 38
  • 10. Consensus Library vs. Centralized Service Consensus library Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 7 / 38
  • 11. Consensus Library vs. Centralized Service Consensus library • Library code has to be included in every program. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 7 / 38
  • 12. Consensus Library vs. Centralized Service Consensus library • Library code has to be included in every program. • Application developers to be experts in distributed system. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 7 / 38
  • 13. Consensus Library vs. Centralized Service Consensus library • Library code has to be included in every program. • Application developers to be experts in distributed system. • Need to wait on majority (quorums). Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 7 / 38
  • 14. Consensus Library vs. Centralized Service Consensus library • Library code has to be included in every program. • Application developers to be experts in distributed system. • Need to wait on majority (quorums). Centralized lock service Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 7 / 38
  • 15. Consensus Library vs. Centralized Service Consensus library • Library code has to be included in every program. • Application developers to be experts in distributed system. • Need to wait on majority (quorums). Centralized lock service • Clean interface. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 7 / 38
  • 16. Consensus Library vs. Centralized Service Consensus library • Library code has to be included in every program. • Application developers to be experts in distributed system. • Need to wait on majority (quorums). Centralized lock service • Clean interface. • Service can be modified independently. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 7 / 38
  • 17. Consensus Library vs. Centralized Service Consensus library • Library code has to be included in every program. • Application developers to be experts in distributed system. • Need to wait on majority (quorums). Centralized lock service • Clean interface. • Service can be modified independently. • A single client can obtain a lock and make progress (non-quorum based decisions, the lock service takes care of it) Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 7 / 38
  • 18. Consensus Library vs. Centralized Service Consensus library • Library code has to be included in every program. • Application developers to be experts in distributed system. • Need to wait on majority (quorums). Centralized lock service • Clean interface. • Service can be modified independently. • A single client can obtain a lock and make progress (non-quorum based decisions, the lock service takes care of it) • Application developers do not have to worry about dealing with operations, various failure modes debugging etc. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 7 / 38
  • 19. Centralized Lock Service Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 8 / 38
  • 20. Chubby is a Lock Service Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 9 / 38
  • 21. Chubby Primary goals: reliability and availability (not high performance) Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 10 / 38
  • 22. Chubby Primary goals: reliability and availability (not high performance) Used in Google: GFS, Bigtable, etc. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 10 / 38
  • 23. Chubby Structure Two main components: • Server (chubby cell) • Client library Communicate via RPC Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 11 / 38
  • 24. Chubby Cell A small set of servers (replicas) Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 12 / 38
  • 25. Chubby Cell A small set of servers (replicas) One is the master (elected from the replicas via Paxos) Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 12 / 38
  • 26. Chubby Cell A small set of servers (replicas) One is the master (elected from the replicas via Paxos) Maintain copies of simple database (replicated state machines) Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 12 / 38
  • 27. Chubby Cell A small set of servers (replicas) One is the master (elected from the replicas via Paxos) Maintain copies of simple database (replicated state machines) • Only the master initiates reads and writes of this database. • All other replicas simply copy updates from the master (using the Paxos protocol). Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 12 / 38
  • 28. Chubby Client Find the master: sending master location requests to replicas. All requests are sent directly to the master. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 13 / 38
  • 29. Chubby Operations Write requests Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 14 / 38
  • 30. Chubby Operations Write requests • The master receives the request. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 14 / 38
  • 31. Chubby Operations Write requests • The master receives the request. • Write requests are propagated via the consensus protocol to all replicas. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 14 / 38
  • 32. Chubby Operations Write requests • The master receives the request. • Write requests are propagated via the consensus protocol to all replicas. • Acknowledges when write request reaches the majority of the replicas Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 14 / 38
  • 33. Chubby Operations Write requests • The master receives the request. • Write requests are propagated via the consensus protocol to all replicas. • Acknowledges when write request reaches the majority of the replicas Read requests Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 14 / 38
  • 34. Chubby Operations Write requests • The master receives the request. • Write requests are propagated via the consensus protocol to all replicas. • Acknowledges when write request reaches the majority of the replicas Read requests • Satisfied by the master alone. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 14 / 38
  • 35. Let’s Go into More Details Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 15 / 38
  • 36. Chubby Interface (1/2) Chubby exports a unix-like filesystem interface. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 16 / 38
  • 37. Chubby Interface (1/2) Chubby exports a unix-like filesystem interface. Tree of files with names separated by slashes (namesapace): /ls/cell1/aut/cc14: Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 16 / 38
  • 38. Chubby Interface (1/2) Chubby exports a unix-like filesystem interface. Tree of files with names separated by slashes (namesapace): /ls/cell1/aut/cc14: • 1st component (ls): lock service (common to all names) Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 16 / 38
  • 39. Chubby Interface (1/2) Chubby exports a unix-like filesystem interface. Tree of files with names separated by slashes (namesapace): /ls/cell1/aut/cc14: • 1st component (ls): lock service (common to all names) • 2nd component (cell1): the chubby cell name Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 16 / 38
  • 40. Chubby Interface (1/2) Chubby exports a unix-like filesystem interface. Tree of files with names separated by slashes (namesapace): /ls/cell1/aut/cc14: • 1st component (ls): lock service (common to all names) • 2nd component (cell1): the chubby cell name • The rest: the name of the directory and the file (name inside the cell) Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 16 / 38
  • 41. Chubby Interface (2/2) Chubby maintains a namespace that contains files and directories, called nodes. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 17 / 38
  • 42. Chubby Interface (2/2) Chubby maintains a namespace that contains files and directories, called nodes. Clients can create nodes and add contents to nodes. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 17 / 38
  • 43. Chubby Interface (2/2) Chubby maintains a namespace that contains files and directories, called nodes. Clients can create nodes and add contents to nodes. Support most normal operations: • create, delete, open, write, ... Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 17 / 38
  • 44. Chubby Interface (2/2) Chubby maintains a namespace that contains files and directories, called nodes. Clients can create nodes and add contents to nodes. Support most normal operations: • create, delete, open, write, ... Clients open nodes to obtain handles that are analogous to unix file descriptors. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 17 / 38
  • 45. Locks (1/2) Each Chubby file and directory (node) can act as a reader/writer lock. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 18 / 38
  • 46. Locks (1/2) Each Chubby file and directory (node) can act as a reader/writer lock. A client handle may hold the lock in two modes: writer (exclusive) mode or reader (shared) mode. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 18 / 38
  • 47. Locks (1/2) Each Chubby file and directory (node) can act as a reader/writer lock. A client handle may hold the lock in two modes: writer (exclusive) mode or reader (shared) mode. • Only one client can hold lock in writer mode. • Many clients can hold lock in reader mode. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 18 / 38
  • 48. Locks (2/2) Locks are advisory in Chubby. • Holding a lock is not necessary to access file. • Holding a lock does not prevent other clients accessing file. • Conflicts only with other attempts to acquire the same lock. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 19 / 38
  • 49. Chubby Example (1/2) Electing a primary (leader) process. Steps in primary election: Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 20 / 38
  • 50. Chubby Example (1/2) Electing a primary (leader) process. Steps in primary election: • Clients open a lock file and attempt to acquire the lock in write mode. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 20 / 38
  • 51. Chubby Example (1/2) Electing a primary (leader) process. Steps in primary election: • Clients open a lock file and attempt to acquire the lock in write mode. • One client succeeds (becomes the primary) and writes its name/identity into the file. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 20 / 38
  • 52. Chubby Example (1/2) Electing a primary (leader) process. Steps in primary election: • Clients open a lock file and attempt to acquire the lock in write mode. • One client succeeds (becomes the primary) and writes its name/identity into the file. • Other clients fail (become replicas) and discover the name of the primary by reading the file. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 20 / 38
  • 53. Chubby Example (1/2) Electing a primary (leader) process. Steps in primary election: • Clients open a lock file and attempt to acquire the lock in write mode. • One client succeeds (becomes the primary) and writes its name/identity into the file. • Other clients fail (become replicas) and discover the name of the primary by reading the file. • Primary obtains sequencer and passes it to servers (servers can insure that the elected primary is still valid). Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 20 / 38
  • 54. Chubby Example (2/2) Leader election Open("/ls/foo/OurServicePrimary", "write mode") if (successful) { // primary SetContents(primary_identity) } else { // replica Open("/ls/foo/OurServicePrimary", "read mode", "file-modification event") when notified of file modification: primary = GetContentsAndStat() } Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 21 / 38
  • 55. APIs Open(), Close(), Delete() GetContentsAndStat(), GetStat(), ReadDir() SetContents() SetACL() Locks: Acquire(), TryAcquire(), Release() Sequencers: GetSequencer(), SetSequencer(), CheckSequencer() Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 22 / 38
  • 56. Events Chubby clients may subscribe to many events when they create a handle. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 23 / 38
  • 57. Events Chubby clients may subscribe to many events when they create a handle. Events include: • File contents modified • Child node added, removed, or modified • Chubby master failed over • Handle has become invalid • Lock acquired • Conflicting lock requested from another client Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 23 / 38
  • 58. Locking - Problem After acquiring a lock and issuing a request (R1) a client may fail. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 24 / 38
  • 59. Locking - Problem After acquiring a lock and issuing a request (R1) a client may fail. Another client may acquire the lock and issue its own request (R2). Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 24 / 38
  • 60. Locking - Problem After acquiring a lock and issuing a request (R1) a client may fail. Another client may acquire the lock and issue its own request (R2). R1 arrive later at the server and be acted upon (possible inconsis- tency). Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 24 / 38
  • 61. Locking - Solutions (1/3) Solution 1: virtual time Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 25 / 38
  • 62. Locking - Solutions (1/3) Solution 1: virtual time • But, it is costly to introduce sequence numbers into all the interactions. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 25 / 38
  • 63. Locking - Solutions (2/3) Solution 2: sequencers Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 26 / 38
  • 64. Locking - Solutions (2/3) Solution 2: sequencers • Sequence numbers are introduced into only those interactions that make use of locks. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 26 / 38
  • 65. Locking - Solutions (2/3) Solution 2: sequencers • Sequence numbers are introduced into only those interactions that make use of locks. • A lock holder can obtain a sequencer from Chubby. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 26 / 38
  • 66. Locking - Solutions (2/3) Solution 2: sequencers • Sequence numbers are introduced into only those interactions that make use of locks. • A lock holder can obtain a sequencer from Chubby. • It attaches the sequencer to any requests that it sends to other servers (e.g., Bigtable) Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 26 / 38
  • 67. Locking - Solutions (2/3) Solution 2: sequencers • Sequence numbers are introduced into only those interactions that make use of locks. • A lock holder can obtain a sequencer from Chubby. • It attaches the sequencer to any requests that it sends to other servers (e.g., Bigtable) • The other servers can verify the sequencer information Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 26 / 38
  • 68. Locking - Solutions (3/3) Solution 3: lock-delay Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 27 / 38
  • 69. Locking - Solutions (3/3) Solution 3: lock-delay • Correctly released locks are immediately available. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 27 / 38
  • 70. Locking - Solutions (3/3) Solution 3: lock-delay • Correctly released locks are immediately available. • If a lock becomes free because holder failed or becomes inaccessible, lock cannot be claimed by another client until lock-delay expires. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 27 / 38
  • 71. Caching To reduce read traffic, Chubby clients cache file data and node meta-data. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 28 / 38
  • 72. Caching To reduce read traffic, Chubby clients cache file data and node meta-data. Write-through: write is done synchronously both to the cache and to the primary copy. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 28 / 38
  • 73. Caching To reduce read traffic, Chubby clients cache file data and node meta-data. Write-through: write is done synchronously both to the cache and to the primary copy. Master will invalidate cached copies upon a write request. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 28 / 38
  • 74. Caching To reduce read traffic, Chubby clients cache file data and node meta-data. Write-through: write is done synchronously both to the cache and to the primary copy. Master will invalidate cached copies upon a write request. The client also can allow its cache lease to expire. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 28 / 38
  • 75. Session Lease (1/3) Lease: a promise by one party to abide by an agreement for a given interval of time unless the promise is explicitly revoked. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 29 / 38
  • 76. Session Lease (1/3) Lease: a promise by one party to abide by an agreement for a given interval of time unless the promise is explicitly revoked. Session: a relationship between a Chubby cell and a Chubby client. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 29 / 38
  • 77. Session Lease (1/3) Lease: a promise by one party to abide by an agreement for a given interval of time unless the promise is explicitly revoked. Session: a relationship between a Chubby cell and a Chubby client. Session lease: interval during which client’s cached items (handles, locks, files) are valid. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 29 / 38
  • 78. Session Lease (1/3) Lease: a promise by one party to abide by an agreement for a given interval of time unless the promise is explicitly revoked. Session: a relationship between a Chubby cell and a Chubby client. Session lease: interval during which client’s cached items (handles, locks, files) are valid. Session maintained through KeepAlives messages. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 29 / 38
  • 79. Session Lease (2/3) If client’s local lease expires (happens when a master fails over) Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 30 / 38
  • 80. Session Lease (2/3) If client’s local lease expires (happens when a master fails over) • Client disables cache. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 30 / 38
  • 81. Session Lease (2/3) If client’s local lease expires (happens when a master fails over) • Client disables cache. • Session in jeopardy, client waits in grace period (45s). Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 30 / 38
  • 82. Session Lease (2/3) If client’s local lease expires (happens when a master fails over) • Client disables cache. • Session in jeopardy, client waits in grace period (45s). • Sends jeopardy event to application. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 30 / 38
  • 83. Session Lease (2/3) If client’s local lease expires (happens when a master fails over) • Client disables cache. • Session in jeopardy, client waits in grace period (45s). • Sends jeopardy event to application. • Application can suspend activity. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 30 / 38
  • 84. Session Lease (3/3) Client attempts to exchange KeepAlive messages with master during grace period. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 31 / 38
  • 85. Session Lease (3/3) Client attempts to exchange KeepAlive messages with master during grace period. • Succeed: re-enables cache; send safe event to application Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 31 / 38
  • 86. Session Lease (3/3) Client attempts to exchange KeepAlive messages with master during grace period. • Succeed: re-enables cache; send safe event to application • Failed: discards cache; sends expired event to application Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 31 / 38
  • 87. Scaling Reducing communication with the master. Two techniques: • Proxies • Partitioning Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 32 / 38
  • 88. Scaling - Proxies Proxy is a trusted process that pass requests from other clients to a Chubby cell. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 33 / 38
  • 89. Scaling - Proxies Proxy is a trusted process that pass requests from other clients to a Chubby cell. Can handle KeepAlives and reads → reduce the master load. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 33 / 38
  • 90. Scaling - Proxies Proxy is a trusted process that pass requests from other clients to a Chubby cell. Can handle KeepAlives and reads → reduce the master load. Cannot reduce write loads, but they are << 1% of workload. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 33 / 38
  • 91. Scaling - Proxies Proxy is a trusted process that pass requests from other clients to a Chubby cell. Can handle KeepAlives and reads → reduce the master load. Cannot reduce write loads, but they are << 1% of workload. Introduces another point of failure. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 33 / 38
  • 92. Scaling - Partitioning Namespace partitioned between servers. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 34 / 38
  • 93. Scaling - Partitioning Namespace partitioned between servers. N partitions, each with master and replicas Node D/C stored on P(D/C) = hash(D) mod N Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 34 / 38
  • 94. Summary Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 35 / 38
  • 95. Summary Library vs. Service Chubby: coarse-grained lock service Chubby cell and clients Unix-like interface Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 36 / 38
  • 96. References: M. Burrows, The Chubby lock service for loosely-coupled distributed systems, OSDI, 2006. Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 37 / 38
  • 97. Questions? Amir H. Payberah (Tehran Polytechnic) Chubby 1393/7/5 38 / 38