236 lines
No EOL
11 KiB
Markdown
236 lines
No EOL
11 KiB
Markdown
Given objects of type T and an object of type Z, is it possible to wait-free implement Z by using only objects of type T and atomic R/W registers?
|
||
|
||
If yes, then Z is not as essential as T (T can do everything Z can do).
|
||
|
||
But what is the **type** of an object?
|
||
1. the set of all the possible *values for states* of objects of that type
|
||
2. a set of *operations* for manipulating the object, each provided with a *specification*: a description of the conditions under which the operation can be invoked and the effect of the invocation
|
||
|
||
We focus on types whose operations are:
|
||
- **Total:** all operations can be invoked in any state of the object
|
||
- **Sequentially specified:** given the initial state, the behavior depends only by the sequence of operations, where the output to every operation invocation only depends on the input arguments and the invocations preceding it
|
||
- formally, $𝛿(s, op(args)) = {⟨s1,res1⟩,…, ⟨sk,resk⟩}$
|
||
- it is *deterministic* whenever k=1, for every s and every op(args)
|
||
- è deterministica quando dato uno stato e dato un'operazione, posso raggiungere solo uno stato (se raffigurato come grafo, ci sarebbe un solo arco)
|
||
- ricorda un po' gli automi o le Turing Machine
|
||
|
||
An object of type $T_{U}$ is **universal** if every other object can be wait-free implemented using only objects of type $T_U$ and atomic R/W registers.
|
||
|
||
This object is a **consensus object**: a *one-shot object* (every process can access it at most once) that provides only one operation `propose(v)` such that:
|
||
- **Validity:** the returned value (or *decided value*) is one of the arguments of the propose (*proposed value*) in one invocation done by a process (*participant*)
|
||
- **Integrity:** every process decides at most once
|
||
- **Agreement:** the decided value is the same for all processes
|
||
- **Wait-freedom:** every invocation of propose by a correct process terminates
|
||
|
||
Conceptually, we can implement a consensus object by a register X, initialized at ⊥, for which the propose operation is atomically defined as:
|
||
```
|
||
propose(v) :=
|
||
if X = ⊥ then
|
||
X <- v
|
||
return X
|
||
```
|
||
|
||
Universality of consensus holds as folows:
|
||
- given an object O of type Z
|
||
- each participant runs a local copy of O, all initialized at the same value
|
||
- create a total order on the operations O, by using consensus objects
|
||
- force all processes to follow this order to locally simulate O
|
||
- all local copies are consistent
|
||
(O can be any object, e.g. a queue)
|
||
|
||
### First attempt (non wait-free)
|
||
Let us first concentrate on obj's with deterministic specifications (we will see how to handle non-determinism later on)
|
||
|
||
A process $p_i$ that wants to invoke operation op of object Z with arguments args locally runs:
|
||
```
|
||
on the foreground thread of pi:
|
||
|
||
result_i <- ⊥
|
||
invoc_i <- ⟨op(args) , i⟩
|
||
wait result_i != ⊥
|
||
return result_i
|
||
```
|
||
|
||
Every process locally runs also a simulation of Z (stored in the local variable $Z_i$) such that:
|
||
- $Z_{i}$ is initialized at the initial value of Z
|
||
- every process performs on $Z_i$ all the operations performed in Z by all processes, in the same order
|
||
- need of consensus
|
||
|
||
Let CONS be an unbounded array of consensus objects and 𝜋1 and 𝜋2 respectively denote the first and the second element of a given pair (aka the first and second *projection*)
|
||
|
||
The local simulation of Z by $p_i$ is
|
||
|
||
```
|
||
on the background thread of pi:
|
||
|
||
k <- 0
|
||
z_i <- Z.init()
|
||
while true
|
||
if invoc_i ≠ ⊥ then # prosegue se ci sono invocazioni
|
||
k++
|
||
exec_i <- CONS[k].propose(invoc_i)
|
||
⟨z_i , res⟩ <- 𝛿(z_i, 𝜋1(exec_i))
|
||
if 𝜋2(exec_i) = i then
|
||
invoc_i <- ⊥
|
||
result_i <- res
|
||
```
|
||
This solution is non-blocking but not wait-free (can run forever).
|
||
P.S. *invoc_i*, *result_i* e *z_i* sono variabili locali.
|
||
### A wait-free construction
|
||
- `LAST_OP[1..n]:` array of SWMR atomic R/W registers containing pairs init at ⟨⊥,0⟩ ∀i
|
||
- è condiviso e ogni processo ci infila le sue proposal
|
||
- `last_sn_i[1..n]:` local array of the last op by $p_j$ executed by $p_i$ init at 0 ∀i,j
|
||
- ogni processo ha la sua copia locale
|
||
- `last_sn_i[j]` tiene traccia dell'ultima operazione eseguita da j
|
||
|
||
Idea: instead of just proposing my proposal, at every moment I propose all the proposals of all the processes.
|
||
|
||
```⟨⟩
|
||
or(arg) by p_i on Z
|
||
result_i <- ⊥
|
||
LAST_OP[i] <- ⟨op(args), last_sn_i[i]+1⟩
|
||
wait result_i != ⊥
|
||
return result_i
|
||
|
||
l'idea è che scrivo la mia proposal sulla shared memory, così ogni processo può leggere la mia proposal.
|
||
|
||
local simulation of Z by p_i
|
||
k <- 0
|
||
z_i <- Z.init()
|
||
while true
|
||
invoc_i <- 𝜀
|
||
∀ j = 1..n
|
||
if 𝜋2(LAST_OP[j]) > last_sn_i[j] then
|
||
# j ha fatto una nuova proposta
|
||
invoc_i.append(⟨𝜋1(LAST_OP[j]),j⟩)
|
||
# allora la aggiungo alla mia lista invoc_i
|
||
# per eventualmente proporlo come valore dopo
|
||
if invoc_i ≠ 𝜀 then
|
||
# ci sono nuove invocazioni
|
||
# allora partecipo al round
|
||
k++
|
||
exec_i <- CONS[k].propose(invoc_i)
|
||
# sto proponendo UNA LISTA di operazioni
|
||
# e propose in pratica me le mette in ordine
|
||
# così che posso eseguirle nello stesso ordine
|
||
# degli altri processi
|
||
for r=1 to |exec_i|
|
||
⟨z_i,res⟩ -> 𝛿(z_i,𝜋1(exec_i[r]))
|
||
j <- 𝜋2(exec_i[r])
|
||
last_sn_i[j]++
|
||
if i=j then
|
||
result_i <- res
|
||
```
|
||
As before, they are executed by each process on two threads.
|
||
##### Is it correct?
|
||
**Lemma 1:** the construction is wait free
|
||
*Proof:*
|
||
Let pi invoke Z.op(par); to prove the Lemma, we need to show that eventually result_i ≠ ⊥
|
||
|
||
It suffices to prove that ∃ k s.t. `CONS[k].propose(-)` returns a list that contains ⟨“op(par)”,i⟩ and pi participates to the consensus:
|
||
1. since $p_i$ increases the sequence number of `LAST_OP[i]`, eventually all the lists proposed contain that element forever
|
||
2. eventually, $p_i$ will always find `invoc_i != ε` and participates with an increasing sequence number $k_i$
|
||
3. let k be the first consensus where (a) holds
|
||
4. by the properties of consensus, exec_i contains ⟨“op(par)”,i⟩
|
||
|
||
As shown before, every invocation is executed; we have to show that this cannot happen more than once.
|
||
|
||
If $p_i$ has inserted ⟨“op(par)”,sn⟩ in `LAST_OP[i]`, it cannot invoke another operation until ⟨“op(par)”,sn⟩ appears in a list chosen by the consensus
|
||
- the sequence number of `LAST_OP[i]` can increase only after ⟨“op(par)”,sn⟩ has been executed
|
||
|
||
Let k be the minimum such that `CONS[k]` contains ⟨“op(par)”,i⟩
|
||
|
||
Every $p_{j}$ that participates in the k-th consensus increases `last_sn_j[i]` before calculating a new proposal `invoc_j`
|
||
- if `π2(LAST_OP[i])` is not changed, then the guard of the IF is false and ⟨“op(par)”,i⟩ is not appended in `invoc_j`
|
||
- otherwise, we have a new invocation from $p_{i}$ (but this can only happen after that ⟨“op(par)”,sn⟩ has been executed)
|
||
|
||
**Lemma 3:** the local copies $s_{i}$ if all correct processes behave the same and comply with the sequential specification of Z
|
||
|
||
*Proof:*
|
||
- the processes use CONS in the same order `(CONS[1], CONS[2], …)`
|
||
- every consensus object returns the same list to all processes
|
||
- all processes scan the returned list from head to tail
|
||
- then apply the same sequence of operations to their local copies (that started from the same initial value)
|
||
- everything works because of determinism!
|
||
|
||
REMARK: bounded wait freedom does not hold:
|
||
...
|
||
|
||
### Solution for non-deterministic specifications
|
||
If the specifications of Z’s operations are non-deterministic, then 𝛿 does not return one single possible pair after one invocation, but a set of possible choices.
|
||
|
||
How to force every process to run the very same sequence of operations on their local simulations?
|
||
Three possible ways!
|
||
|
||
1. **Brute force solution:** for every pair ⟨s , op(args)⟩, fix a priori one element of 𝛿(⟨s , op(args)⟩) to be chosen
|
||
- literally cancelling non-determinism, but this is not something we want...
|
||
2. additional consensus objects, one for every element of every list
|
||
3. reuse the same consensus object: for all k, `CONS[k]` not only chooses the list of invocations, but also the final state of every invocation
|
||
- the proposals should also pre-calculate the next state and propose one
|
||
|
||
### Binary vs Multivalued Consensus
|
||
Binary consensus: just two possible proposals (say, 0 and 1)
|
||
|
||
##### Multivalued consensus (with unbounded values)
|
||
IDEA:
|
||
- we have n processes and n binary consensus rounds
|
||
- at round k, all processes propose 1 if $p_k$ has proposed something, 0 otherwise
|
||
- if the decided value is 1, then decide $p_k$'s proposal; otherwise, go to the next round.
|
||
|
||
```
|
||
PROP[1..n] array of n proposals, all init at ⊥
|
||
BC[1..n] array of n binary consensus objects
|
||
|
||
mv_propose(i,v) :=
|
||
PROP[i] <- v
|
||
for k=1 to n do
|
||
if PROP[k] ≠ ⊥ then
|
||
res <- BC[k].propose(1)
|
||
else
|
||
res <- BC[k].propose(0)
|
||
if res = 1 then
|
||
return PROP[k]
|
||
wait forever
|
||
```
|
||
why wait forever? Because we want to prove that it is wait free, but not because the loop finishes.
|
||
|
||
##### Validity, Agreement, Integrity, Termination:
|
||
- Let $p_{x}$ the first process that writes a proposal
|
||
- every $p_i$ that participates to the consensus reads the other proposals after that it has written `PROP[i]`
|
||
- all participants starts their for loops after $p_x$ has written its proposal
|
||
- every $p_i$ that participates to the consensus finds `PROP[x] != ⊥` in their for loop
|
||
- `BC[x]` only receives proposals equal to 1
|
||
- because of validity of binary consensus, `BC[x]` returns 1
|
||
- every $p_i$ that participates to the consensus receives 1 at most in its x-th iteration of the for
|
||
- let y (<= x) be the first index such that `BC[y]` returns 1
|
||
- `BC[z] = 0` for all $z < y$
|
||
- since all participating process invoke the binary consensus in the same order, they all decide the value proposed by $p_y$ and terminate.
|
||
|
||
|
||
##### Multivalued consensus (with bounded values)
|
||
Let k be the number of possible proposals and $h = \lceil \log k \rceil$ be the number of bits needed to binary represent them (this value is known to all processes).
|
||
IDEA: decide bit by bit the final outcome
|
||
|
||
```
|
||
PROP[1..n][1..h] array of n h-bits proposals, all init at ⊥ BC[1..h] array of h binary consensus objects
|
||
|
||
bmv_propose(i,v) :=
|
||
PROP[i] <- binary_repr_h(v) # scrivo la rappresentazione binaria
|
||
for k=1 to h do
|
||
P <- {PROP[j][k] | PROP[j]≠⊥ ∧ PROP[j][1..k-1]=res[1..k-1]}
|
||
let b be an element of P
|
||
res[k] <- BC[k].propose(b)
|
||
return value(res)
|
||
```
|
||
(exercise: remove `∧ PROP[j][1..k-1]=res[1..k-1]` and execution in which a consensus property can be broken)
|
||
|
||
*Proof:*
|
||
- Wait freedom: trivial
|
||
- Integrity: trivial
|
||
- Agreement: by agreement of the h binary consensus objects
|
||
- Validity: for all k, we prove that, if dec is the decided value, then there exists a j such that $p_j$ is participating (`PROP[j] ≠ ⊥`) and `dec[1...k] = PROP[j][1...k]`
|
||
- by construction, P contains the k-th bits of the proposals whose first (k-1) bits coincide with the first k-1 bits decides so far:
|
||
- for every b ∈ P, there exists a j such that `PROP[j] ≠ ⊥`,
|
||
- `PROP[j][1..k-1] = dec[1..k-1]` and `PROP[j][k] = b`
|
||
- whatever b ∈ P is selected in the k-th binary consensus, there exists a j such that `PROP[j] ≠ ⊥` and `PROP[j][1..k] = dec[1..k]`
|
||
- by taking `k = h`, we can conclude. |