We now describe the second main model used to describe distributed systems, the shared memory model. To illustrate algorithmic issues in this model we discuss solutions for the mutual exclusion problem.
The shared memory is modeled in terms of a collection of shared variables, commonly referred to as registers. We assume the system contains n processors, p_0, ..., p_{n-1}, and m registers, R_0, ..., R_{m-1}. Each processor is modeled as a state machine. Each register has a type, which specifies:
the values it can hold,
the operations that can be performed on it,
the value (if any) to be returned by each operation, and
the new register value resulting from each operation.
Each register can have an initial value.
For example, an integer valued read/write register R can take on all integer values and has operations read(R,v) and write(R,v). The read operation returns the value of R in v, leaving R unchanged. The write(R,v) operation has an integer parameter v, returns no value and changes R's value to v. A configuration is a vector C = (q_0, ..., q_{n-1}, r_0, ..., r_{m-1}), where q_i is a state of processor p_i and r_j is a value of register R_j. The events are computation steps at the processors where the following happens atomically (indivisibly):
p_i chooses a shared variable to access with a specific operation, based on p_i's current state,
the specified operation is performed on the shared variable,
p_i's state changes according to its transition function, applied to its current state and the value returned by the shared memory operation performed.
A finite sequence of configurations and events that begins with an initial configuration is called an execution. In the asynchronous shared memory system, an infinite execution is admissible if it has an infinite number of computation steps.
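The step semantics above can be sketched in a few lines of Python. This is our own toy rendering, not part of the text's formalism: the parameter names `operation` and `transition` are illustrative, and a real system would of course not funnel every access through one function.

```python
def step(states, registers, i, reg, operation, transition):
    """One atomic computation step of processor p_i on register `reg`.

    `operation` maps the register's old value to (returned value, new value);
    `transition` maps (p_i's current state, returned value) to p_i's next state.
    """
    returned, new_value = operation(registers[reg])
    registers[reg] = new_value                    # the register is updated ...
    states[i] = transition(states[i], returned)   # ... and p_i changes state
    return states, registers

# A read returns the register's value and leaves it unchanged.
read = lambda old: (old, old)
# A write of v returns no value and stores v.
write = lambda v: (lambda old: (None, v))

states, registers = ["idle"], [7]   # one processor, one register
states, registers = step(states, registers, 0, 0, read,
                         lambda st, ret: ("saw", ret))
assert states[0] == ("saw", 7) and registers[0] == 7

states, registers = step(states, registers, 0, 0, write(3),
                         lambda st, ret: st)
assert registers[0] == 3
```

A configuration here is simply the pair (states, registers), and an execution alternates such configurations with step events.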
In the mutual exclusion problem, a group of processors needs to access a shared resource that cannot be used simultaneously by more than a single processor. The solution needs to have the following two properties. (1) Mutual exclusion: each processor executes a code segment called the critical section so that at any given time at most one processor is executing it (i.e., is in the critical section). (2) Deadlock freedom: if one or more processors attempt to enter the critical section, then one of them eventually succeeds, as long as no processor stays in the critical section forever. These two properties do not provide any individual guarantees to a processor. A stronger property is (3) No lockout: a processor that wishes to enter the critical section eventually succeeds, as long as no processor stays in the critical section forever. Original solutions to this problem relied on special synchronisation support such as semaphores and monitors. We will present some of the distributed solutions using only ordinary shared variables.
We assume the program of a processor is partitioned into the following sections:
Entry / Try: the code executed in preparation for entering the critical section.
Critical: the code to be protected from concurrent execution.
Exit: the code executed when leaving the critical section.
Remainder: the rest of the code.
A processor cycles through these sections in the order: remainder, entry, critical and exit. A processor that wants to enter the critical section first executes the entry section. After that, if successful, it enters the critical section. The processor releases the critical section by executing the exit section and returning to the remainder section. We assume that a processor may transition any number of times from the remainder to the entry section. Moreover, variables, both shared and local, accessed in the entry and exit section are not accessed in the critical and remainder section. Finally, no processor stays in the critical section forever. An algorithm for a shared memory system solves the mutual exclusion problem with no deadlock (or no lockout) if the following hold:
Mutual Exclusion: In every configuration of every execution at most one processor is in the critical section.
No deadlock: In every admissible execution, if some processor is in the entry section in a configuration, then there is a later configuration in which some processor is in the critical section.
No lockout: In every admissible execution, if some processor is in the entry section in a configuration, then there is a later configuration in which that same processor is in the critical section.
In the context of mutual exclusion, an execution is admissible if for every processor p_i, either p_i takes an infinite number of steps or p_i ends in the remainder section. Moreover, no processor is ever stuck in the exit section (the unobstructed exit condition).
A single bit suffices to guarantee mutual exclusion with no deadlock if a powerful test&set register is used. A test&set variable is a binary variable which supports two atomic operations, test&set and reset, defined as follows:
test&set(V: memory address) returns binary value:
    temp := V
    V := 1
    return temp

reset(V: memory address):
    V := 0
The test&set operation atomically reads and updates the variable. The reset operation is merely a write. There is a simple mutual exclusion algorithm with no deadlock, which uses one test&set register.
Mutual exclusion using one test&set register
Initially V equals 0

Entry:
1  wait until test&set(V) = 0
Critical section
Exit:
2  reset(V)
Remainder
Assume that the initial value of V is 0. In the entry section, processor p_i repeatedly tests V until test&set returns 0. The last such test will assign 1 to V, causing any following test&set by other processors to return 1, prohibiting any other processor from entering the critical section. In the exit section p_i resets V to 0; another processor waiting in the entry section can now enter the critical section.
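As an illustration, here is a sketch of this algorithm in Python (our own, not from the text). Python offers no hardware test&set, so the atomic read-and-update is simulated with an internal threading.Lock; the mutual exclusion of the counter updates, however, is provided by V alone. CPython's global interpreter lock makes the interleavings sequentially consistent, which the model assumes.

```python
import sys
import threading

sys.setswitchinterval(1e-4)   # switch threads often, so busy-waiting stays cheap

class TestAndSetRegister:
    def __init__(self):
        self._value = 0
        self._guard = threading.Lock()   # simulates the atomicity of test&set

    def test_and_set(self):
        with self._guard:                # read and update indivisibly
            old, self._value = self._value, 1
            return old

    def reset(self):                     # reset is merely a write
        self._value = 0

V = TestAndSetRegister()
counter = 0                              # the shared resource

def worker():
    global counter
    for _ in range(300):
        while V.test_and_set() == 1:     # entry section: spin until 0 is returned
            pass
        tmp = counter                    # critical section: a non-atomic
        counter = tmp + 1                # read-modify-write
        V.reset()                        # exit section

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert counter == 4 * 300                # no update was lost
```

Without the entry and exit sections, concurrent read-modify-write steps would occasionally lose updates; with them, the final count is exact.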
If a powerful primitive such as test&set is not available, then mutual exclusion must be implemented using only read/write operations.
Lamport's bakery algorithm for mutual exclusion is an early, classical example of such an algorithm that uses only shared read/write registers. The algorithm guarantees mutual exclusion and no lockout for n processors using O(n) registers (but the registers may need to store integer values that cannot be bounded ahead of time).
Processors wishing to enter the critical section behave like customers in a bakery. They all get a number and the one with the smallest number in hand is the next one to be "served". Any processor not standing in line has number 0, which is not counted as the smallest number.
The algorithm uses the following shared data structures: Number is an array of n integers, holding in its i-th entry the current number of processor p_i. Choosing is an array of n boolean values such that Choosing[i] is true while p_i is in the process of obtaining its number. Any processor p_i that wants to enter the critical section attempts to choose a number greater than any number of any other processor and writes it into Number[i]. To do so, processors read the array Number and pick the greatest number read plus one as their own number. Since however several processors might be reading the array at the same time, symmetry is broken by choosing the pair (Number[i], i) as p_i's ticket. An ordering on tickets is defined using the lexicographical ordering on pairs. After choosing its ticket, p_i waits until its ticket is minimal: for all other p_j, p_i waits until p_j is not in the process of choosing a number and then compares their tickets. If p_j's ticket is smaller, p_i waits until p_j executes the critical section and leaves it.
Code for processor p_i, 0 ≤ i ≤ n−1.
Initially Number[i] = 0 and Choosing[i] = FALSE, for 0 ≤ i ≤ n−1.

Entry:
1  Choosing[i] := TRUE
2  Number[i] := max(Number[0], ..., Number[n−1]) + 1
3  Choosing[i] := FALSE
4  for j := 0 to n−1 (j ≠ i) do
5    wait until Choosing[j] = FALSE
6    wait until Number[j] = 0 or (Number[j], j) > (Number[i], i)
Critical section
Exit:
7  Number[i] := 0
Remainder
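The bakery algorithm translates almost line by line into Python threads. The sketch below is our own; it relies on CPython's interpreter lock for sequentially consistent reads and writes of the shared lists, which the algorithm assumes (on weak-memory hardware the shared arrays would need memory fences).

```python
import sys
import threading

sys.setswitchinterval(1e-4)   # preempt often so busy-waiting is cheap

N = 3                         # number of processors in this demo
number = [0] * N              # Number[i]: p_i's current ticket number
choosing = [False] * N        # Choosing[i]: p_i is picking a number

counter = 0                   # shared resource guarded by the bakery lock

def lock(i):
    choosing[i] = True                       # take a ticket one larger than
    number[i] = 1 + max(number)              # any ticket seen
    choosing[i] = False
    for j in range(N):                       # wait out smaller tickets
        if j == i:
            continue
        while choosing[j]:                   # p_j is still picking a number
            pass
        while number[j] != 0 and (number[j], j) < (number[i], i):
            pass                             # p_j's ticket beats ours

def unlock(i):
    number[i] = 0                            # leave the bakery

def worker(i):
    global counter
    for _ in range(200):
        lock(i)
        tmp = counter                        # critical section: a non-atomic
        counter = tmp + 1                    # read-modify-write
        unlock(i)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert counter == N * 200                    # mutual exclusion held
```

Ties between simultaneously chosen equal numbers are broken by the processor index, exactly as in the lexicographical ticket comparison of the pseudocode.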
The bakery algorithm requires the use of unbounded values. We next present an algorithm that removes this requirement. In this algorithm, first presented by Peterson and Fischer, processors compete pairwise using a two-processor algorithm in a tournament tree arrangement. All pairwise competitions are arranged in a complete binary tree. Each processor is assigned to a specific leaf of the tree. At each level, the winner in a given node is allowed to proceed to the next higher level, where it will compete with the winner moving up from the other child of this node (if such a winner exists). The processor that finally wins the competition at the root node is allowed to enter the critical section.
Let n = 2^k for some integer k. Consider a complete binary tree with n/2 leaves and a total of n − 1 nodes. The nodes of the tree are numbered inductively in the following manner: the root is numbered 1; the left child of the node numbered v is numbered 2v and the right child is numbered 2v + 1. Hence the leaves of the tree are numbered n/2, n/2 + 1, ..., n − 1.
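A quick sanity check of the numbering scheme, in a snippet of our own (the processor-to-leaf assignment (n + i) div 2 with side i mod 2 is the standard one for this numbering, assumed here):

```python
n = 8
nodes = list(range(1, n))          # nodes are numbered 1 .. n-1
for v in nodes:
    if 2 * v < n:                  # internal node: both children exist
        assert 2 * v in nodes and 2 * v + 1 in nodes

leaves = [v for v in nodes if 2 * v >= n]
assert leaves == list(range(n // 2, n))   # leaves numbered n/2 .. n-1

# Two processors share each leaf: p_i starts at node (n + i) // 2, side i % 2.
assert [(n + i) // 2 for i in range(n)] == [4, 4, 5, 5, 6, 6, 7, 7]
assert [i % 2 for i in range(n)] == [0, 1, 0, 1, 0, 1, 0, 1]
```

So with n = 8 the tree has 7 nodes, leaves 4 through 7, and each leaf hosts one even-indexed and one odd-indexed processor on opposite sides.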
With each node v, three binary shared variables are associated: Want^v[0], Want^v[1], and Priority^v. All variables have an initial value of 0. The algorithm is recursive. The code of the algorithm consists of a procedure Node(v, side)
which is executed when a processor accesses node v, while assuming the role of processor side. Each node has a critical section: it includes the entry sections at all the nodes on the path from the node's parent to the root, the original critical section and the exit code at all nodes from the root back to the node's parent. To begin, processor p_i executes the code of node (n + i) div 2, playing the role of side i mod 2.
Node(v: integer; side: 0..1)
 1  Want^v[side] := 0
 2  wait until (Want^v[1 − side] = 0 or Priority^v = side)
 3  Want^v[side] := 1
 4  if Priority^v = 1 − side then
 5      if Want^v[1 − side] = 1 then goto line 1
 6  else wait until Want^v[1 − side] = 0
 7  if v = 1 then
 8      Critical section
 9  else Node(v div 2, v mod 2)
10  Priority^v := 1 − side
11  Want^v[side] := 0
end procedure
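The recursive procedure can be exercised directly with Python threads. The sketch below is our own translation (line numbers in the comments refer to the two-processor competition at a node: resetting and raising the Want flag, the priority check, the recursive ascent, and the exit writes); as before it leans on CPython's interpreter lock for sequential consistency.

```python
import sys
import threading

sys.setswitchinterval(1e-4)

N = 4                                  # number of processors, a power of two
want = [[0, 0] for _ in range(N)]      # want[v][side] for nodes v = 1 .. N-1
priority = [0] * N                     # priority[v]; index 0 is unused

counter = 0                            # shared resource

def node(v, side, critical_section):
    while True:
        want[v][side] = 0                                   # line 1
        while not (want[v][1 - side] == 0
                   or priority[v] == side):                 # line 2
            pass
        want[v][side] = 1                                   # line 3
        if priority[v] == 1 - side:                         # line 4
            if want[v][1 - side] == 1:                      # line 5
                continue                                    # goto line 1
        else:
            while want[v][1 - side] != 0:                   # line 6
                pass
        break
    if v == 1:                                              # line 7
        critical_section()                                  # line 8
    else:
        node(v // 2, v % 2, critical_section)               # line 9
    priority[v] = 1 - side                                  # line 10
    want[v][side] = 0                                       # line 11

def worker(i):
    def cs():
        global counter
        tmp = counter                  # non-atomic read-modify-write,
        counter = tmp + 1              # protected by the tournament
    for _ in range(100):
        node((N + i) // 2, i % 2, cs)  # p_i starts at its leaf node

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert counter == N * 100
```

Note how the exit writes run as the recursion unwinds, so a processor releases the root first and its starting node last, matching the description of a node's critical section above.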
This algorithm uses only bounded values and, as we now prove, satisfies the mutual exclusion and no lockout properties.
Proof. Consider any execution. We begin at the nodes closest to the leaves of the tree. A processor enters the critical section of such a node if it reaches line 9 (it moves up to the next node). Assume we are at a node v that connects to the leaves where p_i and p_j start, and let side_i and side_j = 1 − side_i be their respective sides. Assume that both processors are in the critical section of v at some point. It follows from the code that Want^v[side_i] = Want^v[side_j] = 1 at this point. Assume, without loss of generality, that p_j's last write (in line 3) to Want^v[side_j] before entering the critical section follows p_i's last write to Want^v[side_i] before entering the critical section. Note that p_j can enter the critical section (of v) either through line 5 or line 6. In both cases p_j reads Want^v[side_i] = 0. However, p_j's read of Want^v[side_i] follows p_j's write to Want^v[side_j], which by assumption follows p_i's write to Want^v[side_i]. Hence p_j's read of Want^v[side_i] should return 1, a contradiction.
The claim follows by induction on the levels of the tree.
Proof. Consider any admissible execution. Assume that some processor p_i is starved. Hence from some point on p_i is forever in the entry section. We now show that p_i cannot be stuck forever in the entry section of a node v; the claim then follows by induction. Consider first a node v closest to the leaves, let side_i be p_i's side at v, and let p_j be the processor accessing v with side side_j = 1 − side_i.
Case 1: Suppose p_j executes line 10 at some point after p_i is stuck, setting Priority^v to 1 − side_j = side_i. Then Priority^v equals side_i forever after, since only p_i could set it to side_j and p_i never reaches line 10. Thus p_i passes the test in line 2 and skips line 5. Hence p_i must be waiting in line 6 for Want^v[side_j] to become 0, which never occurs. Thus p_j is always executing between lines 3 and 11. But since p_j does not stay in the critical section forever, this would mean that p_j is stuck in the entry section forever, which is impossible: since Want^v[side_i] = 1 and Priority^v = side_i, p_j would execute the goto in line 5 and reset Want^v[side_j] to 0 in line 1.
Case 2: Suppose that from some point on p_j never executes line 10; from that point on Priority^v remains unchanged. Then p_j must eventually be waiting in line 2, be waiting in line 6, or be in the remainder section forever. If p_j waits forever in line 2, or stays forever in the remainder section, then Want^v[side_j] equals 0 henceforth, so p_i cannot be stuck at line 2, 5 or 6, a contradiction. Hence p_j waits in line 6, which implies Priority^v = side_j and Want^v[side_j] = 1 from some point on. Then p_i cannot pass the test in line 2 and cannot be waiting in line 6 (that would require Priority^v = side_i), so p_i eventually waits in line 2 with Want^v[side_i] = 0. Hence p_j passes the test in line 6, enters the critical section and later executes line 10. So p_j cannot be forever in the entry section, a contradiction.
The claim follows by induction on the levels of the tree.
So far, all deadlock-free mutual exclusion algorithms presented require the use of at least n shared variables, where n is the number of processors. Since it was possible to develop an algorithm that uses only bounded values, the question arises whether the number of shared variables used can be reduced. Burns and Lynch first showed that any deadlock-free mutual exclusion algorithm using only shared read/write registers must use at least n shared variables, regardless of their size. The proof of this theorem allows the variables to be multi-writer variables, meaning that every processor is allowed to write to every variable. Note that if the variables are single-writer, the theorem is obvious, since each processor needs to write something to a (separate) variable before entering the critical section. Otherwise a processor could enter the critical section without any other processor knowing, allowing another processor to enter the critical section concurrently, a contradiction to the mutual exclusion property.
The proof by Burns and Lynch introduces a new proof technique, a covering argument: given any deadlock-free mutual exclusion algorithm A, it shows that there is some reachable configuration of A in which each of the n processors is about to write to a distinct shared variable. This is called a covering of the shared variables. The existence of such a configuration can be shown using induction, and it exploits the fact that any processor must write to at least one shared variable before entering the critical section. The proof constructs a covering of all shared variables. A processor then enters the critical section. Immediately thereafter the covering writes are released, so that no processor can detect the processor in the critical section. Another processor now concurrently enters the critical section, a contradiction.
In all mutual exclusion algorithms presented so far, the number of steps taken by a processor before entering the critical section depends on n, the number of processors, even in the absence of contention (which occurs when multiple processors attempt to enter the critical section concurrently), that is, even when a single processor is the only processor in the entry section. In most real systems, however, the expected contention is usually much smaller than n.
A mutual exclusion algorithm is said to be fast if a processor enters the critical section within a constant number of steps when it is the only processor trying to enter the critical section. Note that a fast algorithm requires the use of multi-writer, multi-reader shared variables. If only single-writer variables were used, a processor would have to read at least n − 1 variables (one written by each other processor) just to detect the absence of contention.
Such a fast mutual exclusion algorithm is presented by Lamport.
Code for processor p_i, 1 ≤ i ≤ n.
Initially Fast-Lock and Slow-Lock are 0, and Want[i] is false for all i, 1 ≤ i ≤ n.

Entry:
 1  Want[i] := true
 2  Fast-Lock := i
 3  if Slow-Lock ≠ 0 then
 4      Want[i] := false
 5      wait until Slow-Lock = 0
 6      goto 1
 7  Slow-Lock := i
 8  if Fast-Lock ≠ i then
 9      Want[i] := false
10      for all j, 1 ≤ j ≤ n: wait until Want[j] = false
11      if Slow-Lock ≠ i then
12          wait until Slow-Lock = 0
13          goto 1
Critical section
Exit:
14  Slow-Lock := 0
15  Want[i] := false
Remainder
Lamport's algorithm is based on the correct combination of two mechanisms, one for allowing fast entry when no contention is detected, and the other for providing deadlock freedom in the case of contention. Two variables, Fast-Lock and Slow-Lock, are used for controlling access. In addition, each processor p_i has a boolean variable Want[i] whose value is true if p_i is interested in entering the critical section and false otherwise. A processor can enter the critical section either by finding Fast-Lock = i (in this case it enters the critical section on the fast path) or by finding Slow-Lock = i (in which case it enters the critical section along the slow path).
Consider the case where no processor is in the critical section or in the entry section. In this case, Slow-Lock is 0 and all Want entries are false. Once a processor p_i enters the entry section, it sets Want[i] to true and Fast-Lock to i. Then it checks Slow-Lock, which is 0. Then it sets Slow-Lock to i and checks Fast-Lock again; since no other processor is in the entry section, it reads Fast-Lock = i and enters the critical section along the fast path, with three writes and two reads.
If Fast-Lock ≠ i, then p_i resets Want[i] and waits until all Want flags are reset. After some processor completes the for loop in line 10, the value of Slow-Lock remains unchanged until some processor leaving the critical section resets it. Hence at most one processor p_j may find Slow-Lock = j, and this processor enters the critical section along the slow path. Note that Lamport's fast mutual exclusion algorithm does not guarantee lockout freedom.
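The two paths can be seen in a threaded Python sketch of our own (variable names mirror Fast-Lock, Slow-Lock and Want; the line numbers in the comments refer to the pseudocode convention of numbering the entry section 1 to 13 and the exit section 14 to 15). As with the earlier sketches, CPython's interpreter lock supplies the sequential consistency the algorithm assumes.

```python
import sys
import threading

sys.setswitchinterval(1e-4)

N = 3                         # processors are numbered 1 .. N; 0 means "unset"
fast_lock = 0
slow_lock = 0
want = [False] * (N + 1)

counter = 0                   # shared resource

def lock(i):
    global fast_lock, slow_lock
    while True:
        want[i] = True                        # line 1
        fast_lock = i                         # line 2
        if slow_lock != 0:                    # line 3
            want[i] = False                   # line 4
            while slow_lock != 0:             # line 5
                pass
            continue                          # line 6: goto 1
        slow_lock = i                         # line 7
        if fast_lock == i:                    # line 8: no contention seen,
            return                            # enter on the fast path
        want[i] = False                       # line 9
        for j in range(1, N + 1):             # line 10
            while want[j]:
                pass
        if slow_lock == i:                    # line 11: we hold the slow lock,
            return                            # enter on the slow path
        while slow_lock != 0:                 # line 12
            pass
        # line 13: goto 1 (loop around)

def unlock(i):
    global slow_lock
    slow_lock = 0                             # line 14
    want[i] = False                           # line 15

def worker(i):
    global counter
    for _ in range(200):
        lock(i)
        tmp = counter                         # non-atomic critical section
        counter = tmp + 1
        unlock(i)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(1, N + 1)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert counter == N * 200
```

Running a single thread alone exercises only the fast path (three writes, two reads); under contention the threads fall back to the slow path, and deadlock freedom guarantees that the run above terminates even though individual threads may be overtaken repeatedly.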
13.8-1 An algorithm solves the 2-mutual exclusion problem if at any time at most two processors are in the critical section. Present an algorithm for solving the 2-mutual exclusion problem using test&set registers.
13.8-4 Isolate a bounded mutual exclusion algorithm with no lockout for two processors from the tournament tree algorithm. Show that your algorithm has the mutual exclusion property. Show that it has the no lockout property.
13.8-5 Prove that Lamport's fast mutual exclusion algorithm has the mutual exclusion property.
13.8-6 Prove that Lamport's fast mutual exclusion algorithm has the no deadlock property.
13.8-7 Show that Lamport's fast mutual exclusion algorithm does not satisfy the no lockout property, i.e. construct an execution in which a processor is locked out of the critical section.
13.8-8 Construct an execution of Lamport's fast mutual exclusion algorithm in which two processors are in the entry section and both read at least n variables before entering the critical section.
Prove that the algorithm sends O(e) messages in any execution, given a graph with n vertices and e edges. What is the exact number of messages as a function of the number of vertices and edges in the graph?
Assume that messages can only be sent in the CW (clockwise) direction, and design an asynchronous algorithm for leader election on a ring that has O(n log n) message complexity.
Hint. Let processors work in phases. Each processor begins in the active mode with a value equal to the identifier of the processor, and under certain conditions can enter the relay mode, where it just relays messages. An active processor waits for messages from two active processors, and then inspects the values sent by the processors, and decides whether to become the leader, remain active and adopt one of the values, or start relaying. Determine how the decisions should be made so as to ensure that if there are three or more active processors, then at least one will remain active; and no matter what values active processors have in a phase, at most half of them will still be active in the next phase.
Show that the validity condition is equivalent to requiring that every nonfaulty processor's decision be the input of some processor.
An alternative version of the consensus problem requires that the input value of one distinguished processor (the general) be distributed to all the other processors (the lieutenants). This problem is also called single source consensus problem. The conditions that need to be satisfied are:
Termination: Every nonfaulty lieutenant must eventually decide,
Agreement: All the nonfaulty lieutenants must have the same decision,
Validity: If the general is nonfaulty, then the common decision value is the general's input.
So if the general is faulty, then the nonfaulty processors need not decide on the general's input, but they must still agree with each other. Consider the synchronous message passing system with Byzantine faults. Show how to transform a solution to the consensus problem (in Subsection 13.4.5) into a solution to the general's problem and vice versa. What are the message and round overheads of your transformation?
Imagine that there are n banks that are interconnected. Each bank starts with some amount of money. Banks do not remember the initial amount of money. Banks keep on transferring money among themselves by sending messages of type ⟨10⟩ that represent the value of a transfer. At some point of time a bank decides to find the total amount of money M in the system. Design an algorithm for calculating M that does not stop monetary transactions.
The definitions of distributed systems presented in this chapter are derived from the book by Attiya and Welch [ 24 ]. The model of distributed computation, for message passing systems without failures, was proposed by Attiya, Dwork, Lynch and Stockmeyer [ 23 ].
Modeling the processors in distributed systems in terms of automata follows the paper of Lynch and Fischer [ 229 ].
The algorithm is presented after the paper due to Segall.
The two generals problem is presented as in the book of Gray [ 144 ].
One of the basic results in the theory of asynchronous systems is that the consensus problem is not solvable even if we have reliable communication systems, and one single faulty processor which fails by crashing. This result was first shown in a breakthrough paper by Fischer, Lynch and Paterson [ 108 ].
The algorithm is based on the paper of Dolev and Strong.
Berman and Garay [ 40 ] proposed an algorithm for the solution of the Byzantine consensus problem for the case n > 4f. Their algorithm needs 2(f + 1) rounds.
The bakery algorithm for mutual exclusion using only shared read/write registers is due to Lamport [ 212 ]. This algorithm requires arbitrarily large values. This requirement is removed by Peterson and Fischer [ 270 ]. After this, Burns and Lynch proved that any deadlock-free mutual exclusion algorithm using only shared read/write registers must use at least n shared variables, regardless of their size [ 52 ].
Important textbooks on distributed algorithms include the monumental volume by Nancy Lynch [ 228 ] published in 1997, the book published by Gerard Tel [ 320 ] in 2000, and the book by Attiya and Welch [ 24 ]. Also of interest is the monograph by Claudia Leopold [ 221 ] published in 2001, and the book by Nicola Santoro [ 296 ], which appeared in 2006.
A recent book on distributed systems is due to A. D. Kshemkalyani and M. Singhal [ 206 ].
Finally, several important open problems in distributed computing can be found in a recent paper of Aspnes et al. [ 21 ].