master-degree-notes/Concurrent Systems/notes/9 - Consensus.md

8.7 KiB
Raw Blame History

Given objects of type T and an object of type Z, is it possible to wait-free implement Z by using only objects of type T and atomic R/W registers?

If yes, then Z is not as essential as T (T can do everything Z can do).

But what is the type of an object?

  1. the set of all the possible values for states of objects of that type
  2. a set of operations for manipulating the object, each provided with a specification: a description of the conditions under which the operation can be invoked and the effect of the invocation

We focus on types whose operations are:

  • Total: all operations can be invoked in any state of the object
  • Sequentially specified: given the initial state, the behavior depends only by the sequence of operations, where the output to every operation invocation only depends on the input arguments and the invocations preceding it
    • formally, 𝛿(s, op(args)) = {⟨s1,res1⟩,…, ⟨sk,resk⟩}
    • it is deterministic whenever k=1, for every s and every op(args)
      • è deterministica quando dato uno stato e dato un'operazione, posso raggiungere solo uno stato (se raffigurato come grafo, ci sarebbe un solo arco)
    • ricorda un po' gli automi o le Turing Machine

An object of type T_{U} is universal if every other object can be wait-free implemented using only objects of type T_U and atomic R/W registers.

This object is a consensus object: a one-shot object (every process can access it at most once) that provides only one operation propose(v) such that:

  • Validity: the returned value (or decided value) is one of the arguments of the propose (proposed value) in one invocation done by a process (participant)
  • Integrity: every process decides at most once
  • Agreement: the decided value is the same for all processes
  • Wait-freedom: every invocation of propose by a correct process terminates

Conceptually, we can implement a consensus object by a register X, initialized at ⊥, for which the propose operation is atomically defined as:

propose(v) :=
	if X = ⊥ then
		X <- v
		return X

Universality of consensus holds as folows:

  • given an object O of type Z
  • each participant runs a local copy of O, all initialized at the same value
  • create a total order on the operations O, by using consensus objects
  • force all processes to follow this order to locally simulate O
    • all local copies are consistent (O can be any object, e.g. a queue)

First attempt (non wait-free)

Let us first concentrate on obj's with deterministic specifications (we will see how to handle non-determinism later on)

A process p_i that wants to invoke operation op of object Z with arguments args locally runs:

on the foreground thread of pi:

result_i <- ⊥
invoc_i <- ⟨op(args) , i⟩
wait result_i != ⊥
return result_i

Every process locally runs also a simulation of Z (stored in the local variable Z_i) such that:

  • Z_{i} is initialized at the initial value of Z
  • every process performs on Z_i all the operations performed in Z by all processes, in the same order
    • need of consensus

Let CONS be an unbounded array of consensus objects and 𝜋1 and 𝜋2 respectively denote the first and the second element of a given pair (aka the first and second projection)

The local simulation of Z by p_i is

on the background thread of pi:

k <- 0
z_i <- Z.init()
while true
	if invoc_i ≠ ⊥ then # prosegue se ci sono invocazioni
		k++
		exec_i <- CONS[k].propose(invoc_i)
		⟨z_i , res⟩ <- 𝛿(z_i, 𝜋1(exec_i))
		if 𝜋2(exec_i) = i then
			invoc_i <- ⊥
			result_i <- res

This solution is non-blocking but not wait-free (can run forever). P.S. invoc_i, result_i e z_i sono variabili locali.

A wait-free construction

  • LAST_OP[1..n]: array of SWMR atomic R/W registers containing pairs init at ⟨⊥,0⟩ ∀i
    • è condiviso e ogni processo ci infila le sue proposal
  • last_sn_i[1..n]: local array of the last op by p_j executed by p_i init at 0 ∀i,j
    • ogni processo ha la sua copia locale
    • last_sn_i[j] tiene traccia dell'ultima operazione eseguita da j

Idea: instead of just proposing my proposal, at every moment I propose all the proposals of all the processes.

or(arg) by p_i on Z
	result_i <- ⊥
	LAST_OP[i] <- ⟨op(args), last_sn_i[i]+1⟩
	wait result_i != ⊥
	return result_i

l'idea è che scrivo la mia proposal sulla shared memory, così ogni processo può leggere la mia proposal.

local simulation of Z by p_i
	k <- 0
	z_i <- Z.init()
	while true
		invoc_i <- 𝜀
		∀ j = 1..n
			if 𝜋2(LAST_OP[j]) > last_sn_i[j] then
				# j ha fatto una nuova proposta
				invoc_i.append(⟨𝜋1(LAST_OP[j]),j⟩)
				# allora la aggiungo alla mia lista invoc_i
				# per eventualmente proporlo come valore dopo
		if invoc_i ≠ 𝜀 then
			# ci sono nuove invocazioni
			# allora partecipo al round
			k++
			exec_i <- CONS[k].propose(invoc_i)
			# sto proponendo UNA LISTA di operazioni
			# e propose in pratica me le mette in ordine
			# così che posso eseguirle nello stesso ordine
			# degli altri processi
			for r=1 to |exec_i|
				⟨z_i,res⟩ -> 𝛿(z_i,𝜋1(exec_i[r]))
				j <- 𝜋2(exec_i[r])
				last_sn_i[j]++
				if i=j then 
					result_i <- res

As before, they are executed by each process on two threads.

Is it correct?

Lemma 1: the construction is wait free Proof: Let pi invoke Z.op(par); to prove the Lemma, we need to show that eventually result_i ≠ ⊥

It suffices to prove that ∃ k s.t. CONS[k].propose(-) returns a list that contains ⟨“op(par)”,i⟩ and pi participates to the consensus:

  1. since p_i increases the sequence number of LAST_OP[i], eventually all the lists proposed contain that element forever
  2. eventually, p_i will always find invoc_i != ε and participates with an increasing sequence number k_i
  3. let k be the first consensus where (a) holds
  4. by the properties of consensus, exec_i contains ⟨“op(par)”,i⟩

As shown before, every invocation is executed; we have to show that this cannot happen more than once.

If p_i has inserted ⟨“op(par)”,sn⟩ in LAST_OP[i], it cannot invoke another operation until ⟨“op(par)”,sn⟩ appears in a list chosen by the consensus

  • the sequence number of LAST_OP[i] can increase only after ⟨“op(par)”,sn⟩ has been executed

Let k be the minimum such that CONS[k] contains ⟨“op(par)”,i⟩

Every p_{j} that participates in the k-th consensus increases last_sn_j[i] before calculating a new proposal invoc_j

  • if π2(LAST_OP[i]) is not changed, then the guard of the IF is false and ⟨“op(par)”,i⟩ is not appended in invoc_j
  • otherwise, we have a new invocation from p_{i} (but this can only happen after that ⟨“op(par)”,sn⟩ has been executed)

Lemma 3: the local copies s_{i} if all correct processes behave the same and comply with the sequential specification of Z

Proof:

  • the processes use CONS in the same order (CONS[1], CONS[2], …)
  • every consensus object returns the same list to all processes
  • all processes scan the returned list from head to tail
    • then apply the same sequence of operations to their local copies (that started from the same initial value)
    • everything works because of determinism!

REMARK: bounded wait freedom does not hold: ...

Solution for non-deterministic specifications

If the specifications of Zs operations are non-deterministic, then 𝛿 does not return one single possible pair after one invocation, but a set of possible choices.

How to force every process to run the very same sequence of operations on their local simulations? Three possible ways!

  1. Brute force solution: for every pair ⟨s , op(args)⟩, fix a priori one element of 𝛿(⟨s , op(args)⟩) to be chosen
    • literally cancelling non-determinism, but this is not something we want...
  2. additional consensus objects, one for every element of every list
  3. reuse the same consensus object: for all k, CONS[k] not only chooses the list of invocations, but also the final state of every invocation
    • the proposals should also pre-calculate the next state and propose one

Binary vs Multivalued Consensus

Binary consensus: just two possible proposals (say, 0 and 1)

Multivalued consensus (with unbounded values)

IDEA:

  • we have n processes and n binary consensus rounds
  • at round k, all processes propose 1 if p_k has proposed something, 0 otherwise
  • if the decided value is 1, then decide $p_k$'s proposal; otherwise, go to the next round.
PROP[1..n] array of n proposals, all init at ⊥
BC[1..n] array of n binary consensus objects

mv_propose(i,v) :=
	PROP[i] <- v
	for k=1 to n do
		if PROP[k] ≠ ⊥ then
			res <- BC[k].propose(1)
		else
			res <- BC[k].propose(0)
		if res = 1 then
			return PROP[k]
	wait forever