138 lines
No EOL
8 KiB
Markdown
138 lines
No EOL
8 KiB
Markdown
Can we take the most basic protocol that satisfies the most basic liveness property (obstruction freedom) and "upgrade" it to bounded wait freedom?
|
||
|
||
**Contention manager:** is an object that allows progress of processes by providing contention-free periods for completing their invocations. It provides 2 operations:
|
||
- `need_help(i)`: invoked by $p_i$ when it discovers that there is contention
|
||
- `stop_help(i)`: invoked by $p_{i}$ when it terminates its current invocation
|
||
|
||
**Enriched implementation:** when a process realizes that there is contention, it invokes need_help; when it completes its current operation, it invokes stop_help.
|
||
|
||
>[!question] Why is it different from lock/unlock?
|
||
>Because this allows failures, and they can also happen in the contention-free period.
|
||
|
||
**PROBLEM:** to distinguish a failure from a long delay, we need objects called ***failure detectors*** (FDs), that provide processes information on the failed processes of the system. According to the type/quality of the info, several FDs can be defined.
|
||
|
||
**Eventually restricted leadership:** given a non-empty set of process IDs X, the failure detector $\Omega_{X}$ provides each process a local variable `ev_leader(X)` such that:
|
||
1. *(Validity)* `ev_leader(x)` always contains a process ID
|
||
2. *(Eventual leadership)* eventually, all `ev_leader(X)` of all non-crashed processes of X for ever contain the same process ID, that is one of them
|
||
|
||
REMARK: the moment in which all variables contain the same leader is unknown.
|
||
|
||
### From obstruction-freedom to non-blocking
|
||
```
|
||
NEED_HELP[1..n] : SWMR atomic R/W boolean registers init at false
|
||
|
||
need_help(i) :=
|
||
NEED_HELP[i] <- true
|
||
repeat
|
||
X <- {j : NEED_HELP[j]}
|
||
until ev_leader(X) = i # loopa finché non è lui stesso il leader (PER TUTTI)
|
||
|
||
stop_help(i) :=
|
||
NEED_HELP[i] <- false
|
||
```
|
||
|
||
**Theorem:** the contention manager just seen transforms an obstr.-free implementation into a non-blocking enriched implementation.
|
||
|
||
*Proof:*
|
||
By contr., $\exists \tau$ s.t. $\exists$ many (> 0) op.'s invoked concurrently that never terminate.
|
||
Let Q be the set of proc.'s that performed these invocations.
|
||
- by enrichment (def. at the beginning), eventually `NEED_HELP[i]=true` forever ($\forall i\in Q$)
|
||
- since crashes are fail-stop, eventually `NEED_HELP[j]` is no longer modified ($\forall j \not \in Q$)
|
||
- $\exists \tau' \geq \tau$ where all proc.'s in Q compute the same X
|
||
|
||
**Observation:** $Q \subseteq X$ (it is possible that $p_j$ sets `NEED_HELP[j]` and then fails)
|
||
|
||
By definition of $\Omega_{X}, \exists \tau'' \geq t'$ s.t. all proc.'s in Q have the same `ev_leader(X)`
|
||
- the leader belongs to Q, since
|
||
- it is involved in the contention
|
||
- it can't be failed (by definition of `ev_leader`)
|
||
- this is the only process allowed to proceed
|
||
- because run in isolation, it eventually terminates (because of obstruction freedom)
|
||
|
||
#### On implementing $\Omega$
|
||
It can be proved that there exists no wait-free implementation of $\Omega$ in an asynchronous system with atomic R/W registers and any number of crashes
|
||
- crashes are indistinguishable from long delays
|
||
- need of timing constraints
|
||
|
||
1. $\exists$ time $\tau_{1}$, time interval $\nabla$ and correct process $p_{L}$ s.t. after $\tau_{1}$ every two consecutive writes to a specific SWMR atomic R/W by $p_{L}$ are at most $\nabla$ time units apart one from the other
|
||
(in pratica esiste un processo che non fallisce che aggiorna un registro almeno ogni $\nabla$, che ci permette di essere sicuri che non sia crashato ma stia facendo le sue cose)
|
||
|
||
2. let t be an upper bound on the number of possible failing processes and f the real number of process failed (hence, $0\leq f\leq t\leq n-1$, with f unknown and t known in advance).
|
||
Then, there are at least $t-f$ correct processes different from $p_L$ with a timer s.t. $\exists$ time $\tau_{2}$ for each time interval $\delta$, if their timer is set to $\delta$ after $\tau_{2}$, it expires at least after $\delta$.
|
||
(stiamo dicendo che il timer scade sicuramente dopo $\delta$, il che ci permette di non considerare erroneamente come fallito il processo. Perché non esattamente a $\delta$? Perché è un sistema asincrono e non c'è un clock globale)
|
||
|
||
>[!warning] Remark
|
||
$\tau_{1}, \tau_{2}, \nabla$ and $p_L$ are all unknown :/
|
||
|
||
IDEA:
|
||
- `PROGRESS[1..n]` is an array of SWMR atomic registers used by proc’s to signal that they’re alive
|
||
- $p_{i}$ suspects $p_j$ if pi doesn’t see any progress of $p_{j}$ after a proper time interval (to be guessed) set in its timer
|
||
- the leader is the least suspected process, or the one with smallest/biggest ID among the least suspected ones (if there are more than one)
|
||
- this changes in time, but not forever (can be proved, but it's not covered here)
|
||
|
||
Guessing the time duration for suspecting a process:
|
||
- `SUSPECT[i,j]` = # of times $p_i$ has suspected $p_j$
|
||
- For all k, take the t+1 minimum values in `SUSPECT[1..n , k]`
|
||
- remember: t is the max number of processes that can fail (it is known)
|
||
- so doing this we ensure considering at least a process that doesn't fail
|
||
- Sum them, to obtain $S_{k}$ (for each process k)
|
||
- The interval to use in the timers is the minimum $S_{k}$
|
||
- it can be proved that this eventually becomes ≥ $\nabla$
|
||
|
||
### From obstruction-freedom to wait-freedom
|
||
**Eventually perfect:** failure detector ♢P provides each process $p_i$ a local variable $suspected_i$ such that
|
||
1. *(Eventual completeness)* eventually, $suspected_{i}$ contains all the indexes of crashed processes, for all correct $p_i$
|
||
2. (*Eventual accuracy*) eventually, $suspected_{i}$ contains only indexes of crashed processes, for all correct $p_{i}$.
|
||
|
||
**Definition:** A failure detector FD1 is **stronger** than a failure detector FD2 if there exists an algorithm that builds FD2 from instances of FD1 and atomic R/W registers.
|
||
|
||
**Proposition:** ♢P is stronger than $\Omega_{X}$
|
||
*Proof:*
|
||
Forall i
|
||
- i ∉ X $\to$ `ev_leader_i(X)` is any ID (and may change in time)
|
||
- $i \in X \to$ `ev_leader_i(X)` $= min\left(( Π \setminus suspected_{i}) \cap X \right)$ where $Π$ denotes the set of all proc. IDs.
|
||
|
||
$\Omega_{X}$ is not stronger than ♢P (so, ♢P is strictly stronger)
|
||
|
||
The formal proof consists in showing that if $\Omega_{X}$ was stronger than ♢P, then consensus would be possible in an asynchronous system with crashes and atomic R/W registers.
|
||
(because ♢P can be used to solve consensus)
|
||
#### A contention manager for ♢P
|
||
We assume a weak timestamp generator, i.e. a function such that, if it returns a positive value t to some process, only a finite number of invocations can obtain a timestamp smaller than or equal to t
|
||
|
||
```
|
||
TS[1..n] : SWMR atomic R/W registers init at 0
|
||
|
||
need_help(i) :=
|
||
TS[i] <- weak_ts()
|
||
repeat
|
||
competing <- {j : TS[j] != 0 and j ∉ suspected_i}
|
||
⟨t,j⟩ <- min{⟨TS[x],x⟩ | x ∈ competing}
|
||
until j = i
|
||
|
||
stop_help(i) :=
|
||
TS[i] <- 0
|
||
```
|
||
|
||
##### Theorem:
|
||
the contention manager just seen transforms an obstruction-free implementation into a wait-free enriched implementation.
|
||
|
||
*Proof:*
|
||
By contradiction, $\exists$ an invocation of a correct $p_i$ that never terminates.
|
||
Let $ti$ be its timestamp.
|
||
- choose the minimum of such ⟨ti,i⟩
|
||
|
||
By constraint of weak_ts(), the set of invocations smaller than ⟨ti,i⟩ (call it I) is finite
|
||
- for every invocation ∈ I from a process $p_j$ that crashed during its execution
|
||
- $p_i$ will eventually and forever suspect $p_j$ (i.e. j ∈ `suspected_i`)
|
||
- eventually j will not be anymore in completing
|
||
- $p_{j}$ will not anymore prevent $p_i$ from proceeding
|
||
|
||
Since ⟨ti,i⟩ is the minimum index of a non-terminating invocation
|
||
- all invocations in I of correct processes terminate
|
||
- if such processes invoke need_help() again, they obtain greater indexes
|
||
- eventually I gets emptied
|
||
|
||
Since $p_i$ is correct, eventually (for all $p_k$ correct):
|
||
- i will not be anymore in $suspected_{i}$ (as $p_i$ will not crash as its correct)
|
||
- $⟨ti,i⟩ = min\{⟨TS[x],x⟩ \space|\space x \in competing_{k}\}$ (all pk will select pi as the minimum)
|
||
Hence, the invocation with index ⟨ti,i⟩ will eventually have exclusive execution, and because of obstruction freedom it eventually terminates. |