The idea is to complement the weaknesses of one system with the strengths of another.

Examples:

- **multiple biometric traits** (e.g. signature + fingerprint, used in India, the USA, etc.)
    - the most obvious meaning of multibiometrics
- **multiple instances:** the same trait acquired from different instances (e.g. two or more different fingers, both irises, both ears, multiple instances of hand geometry...)
- **repeated instances:** the same trait and the same element, acquired multiple times
- **multiple algorithms:** the same trait and the same element, but matched with multiple classifiers
    - exploits the complementary strengths and weaknesses of the different classifiers
- **multiple sensors:** e.g. fingerprint acquired with both an optical and a capacitive sensor

Where does the fusion happen?

It can happen:
- **at sensor level:**
    - not always feasible
- **at feature level:** fusing the feature vectors before matching
    - not always feasible: the feature vectors should be comparable in nature and size
    - an example is when we have multiple samples of the same trait; in this case they will certainly be comparable
- **score level fusion:** or match level fusion; consists of fusing the (probability) scores or the rankings
    - the most feasible solution
    - each system works by itself
    - scores need to be comparable: normalization into a common range may be required
- **decision level fusion:** each subsystem takes its own separate decision, and the decisions are then combined

![[Pasted image 20241212084256.png|500]]
#### Feature level fusion
![[Pasted image 20241212084349.png|600]]

Better results are expected, since much more information is still present at this stage.

Possible problems:

- incompatible feature sets
- feature vector combination may cause the "curse of dimensionality"
- a more complex matcher may be required
- combined vectors may include noisy or redundant data
##### Feature level fusion: serial
Example: using SIFT (scale-invariant feature transform).

Phases:

- feature extraction (SIFT feature set)
- feature normalization: required due to the possible significant differences in the scale of the vector values
- a single vector is then created by linking the two feature vectors

Problems to address:

- **feature selection / reduction**
    - selecting few features is more efficient than keeping the whole vector; techniques that can be used include:
    - **k-means clustering**, keeping only the cluster centers
        - performed after linking the two normalized vectors
    - **neighborhood elimination**
        - points within a certain distance of each other are eliminated
        - performed before linking, on the single vectors
    - **points belonging to specific regions**
        - only points in specific regions of the trait (e.g. eyes, nose, mouth for a face) are kept
- **matching**
    - **point pattern matching**
        - a method to find the number of paired "points" between the probe vector and the gallery one
        - two points are paired if their distance is smaller than a threshold (see the sketch below)
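
A minimal sketch of this pipeline, assuming `probe` and `gallery` are numpy arrays of feature points (one point per row); the function names, the greedy pairing strategy, and the default threshold are illustrative assumptions, not values from the slides:

```python
import numpy as np

def serial_fusion(f1, f2):
    """Min-max normalize each feature set, then link them serially
    into a single point set (one feature point per row)."""
    def norm(f):
        return (f - f.min(axis=0)) / (f.max(axis=0) - f.min(axis=0) + 1e-9)
    return np.vstack([norm(f1), norm(f2)])

def point_pattern_matching(probe, gallery, threshold=0.5):
    """Count the paired points between the probe and gallery sets.

    Two points are paired when their Euclidean distance is below
    `threshold`; each gallery point is used at most once (greedy)."""
    available = np.ones(len(gallery), dtype=bool)
    paired = 0
    for p in probe:
        dists = np.linalg.norm(gallery - p, axis=1)
        dists[~available] = np.inf      # skip already-paired gallery points
        j = int(np.argmin(dists))
        if dists[j] < threshold:
            available[j] = False
            paired += 1
    return paired
```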
##### Feature level fusion: parallel
Parallel combination of the two vectors:

- **vector normalization**
    - the shorter vector is extended to match the size of the other one
    - e.g. by zero-padding
- **pre-processing of vectors**
    - step 1: transform the vectors into unit vectors (dividing them by their L2 norm)
    - step 2: weighted combination through a coefficient $\theta$, based on the lengths of X and Y
    - we can then use X as the real part and Y as the imaginary part of the final vector (see the sketch below)
- **further feature processing:**
    - using linear techniques like PCA, K-L expansion, LDA
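
A minimal sketch of this parallel combination, assuming real-valued numpy vectors; treating $\theta$ as a free parameter is an assumption (the lecture derives it from the lengths of X and Y):

```python
import numpy as np

def parallel_fusion(x, y, theta=0.5):
    """Combine two feature vectors into a single complex vector.

    The shorter vector is zero-padded, both are scaled to unit L2
    norm, and `theta` weights the two modalities; X becomes the real
    part and Y the imaginary part of the fused vector."""
    n = max(len(x), len(y))
    x = np.pad(np.asarray(x, dtype=float), (0, n - len(x)))  # zero-padding
    y = np.pad(np.asarray(y, dtype=float), (0, n - len(y)))
    x /= np.linalg.norm(x) + 1e-12                           # unit vectors
    y /= np.linalg.norm(y) + 1e-12
    return theta * x + 1j * (1 - theta) * y
```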
##### Feature level fusion: CCA
The idea of canonical correlation analysis (CCA) is to find a pair of transformations that maximizes the correlation between the two feature sets.
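
A small sketch using scikit-learn's `CCA`; the synthetic data and the choice to concatenate the projected views into the fused vector are illustrative assumptions:

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))                                        # modality 1 features
Y = X @ rng.normal(size=(20, 15)) + 0.1 * rng.normal(size=(100, 15))  # correlated modality 2

cca = CCA(n_components=5)
Xc, Yc = cca.fit_transform(X, Y)   # paired projections with maximal correlation
fused = np.hstack([Xc, Yc])        # one possible fused feature vector
```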
#### Score level fusion
![[Pasted image 20241212085003.png]]

**Transformation based:** scores from different matchers are first normalized into a common domain and then combined using fusion rules.

**Classifier based:** the scores are treated as features and collected into a feature vector; a further classifier is trained on it (it can be an SVM, a decision tree, a neural network...).
##### Fusion rules
**Abstract:** each classifier outputs a class label.

Majority vote: each classifier votes for a class.

**Rank:** each classifier outputs its class ranking.

Borda count:

- each classifier produces a ranking of the classes according to the probability of the pattern belonging to them
- rankings are converted into scores and summed up
- the class with the highest final score is the one chosen by the multi-classifier

E.g. with 4 available positions, the most probable class gets rank 4 and the least probable rank 1; the ranks from each classifier are then summed.

It can also be used in open-set identification, using a threshold to discard low scores (the score being the sum of ranks), as in the sketch below.
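
A minimal sketch of the Borda count described above (the function name and the example rankings are illustrative):

```python
from collections import defaultdict

def borda_count(rankings):
    """Fuse per-classifier rankings of class labels.

    Each ranking orders the labels from most to least probable; with
    k classes the first gets score k, the last gets 1, and scores are
    summed across classifiers."""
    scores = defaultdict(int)
    for ranking in rankings:
        k = len(ranking)
        for position, label in enumerate(ranking):
            scores[label] += k - position
    return max(scores, key=scores.get), dict(scores)

# e.g. three classifiers ranking four classes
winner, totals = borda_count([["A", "B", "C", "D"],
                              ["B", "A", "D", "C"],
                              ["A", "C", "B", "D"]])
print(winner, totals)   # A wins with 4 + 3 + 4 = 11 points
```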
**Measurement:** each classifier outputs its classification score.

![[Pasted image 20241212090608.png|600]]

Different combination methods are possible (e.g. sum, weighted sum, mean, product, weighted product, max, min, etc.):

- sum: the sum of the returned confidence vectors is computed, and the pattern is classified according to the highest value (see the sketch below)
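
A minimal sketch of the sum rule; the optional weights and the shape convention are assumptions, and scores are assumed already normalized into a common range (see below):

```python
import numpy as np

def sum_rule(confidences, weights=None):
    """Measurement-level fusion by (weighted) sum.

    `confidences` has shape (n_matchers, n_classes); the fused class
    is the argmax of the per-class (weighted) sum of scores."""
    confidences = np.asarray(confidences, dtype=float)
    weights = np.ones(len(confidences)) if weights is None else np.asarray(weights)
    fused = weights @ confidences            # weighted column-wise sum
    return int(np.argmax(fused)), fused
```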
Scores from different matchers are typically heterogeneous:

- different ranges
- similarity vs distance
- different distributions

Normalization is required!

But there are issues to consider when choosing a normalization method:

- robustness: the transformation should not be influenced by outliers
- effectiveness: the estimated parameters of the score distribution should closely approximate the real values
##### Reliability
A reliability measure can be computed for each single response of each subsystem before fusing them into a final response; confidence margins are a possible solution.

Poh and Bengio propose a solution based on FAR and FRR: $M(\Delta) = |FAR(\Delta)-FRR(\Delta)|$
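
A sketch of one simplified reading of this measure, assuming similarity scores, treating $\Delta$ as the operating threshold, and estimating FAR/FRR empirically from genuine and impostor score sets (the function names and this empirical reading are assumptions):

```python
import numpy as np

def far_frr(genuine, impostor, threshold):
    """Empirical FAR/FRR of a similarity matcher at a given threshold."""
    far = float(np.mean(np.asarray(impostor) >= threshold))  # impostors accepted
    frr = float(np.mean(np.asarray(genuine) < threshold))    # genuine users rejected
    return far, frr

def reliability_margin(genuine, impostor, threshold):
    # Poh-Bengio style measure: |FAR - FRR| at the operating point
    far, frr = far_frr(genuine, impostor, threshold)
    return abs(far - frr)
```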
#### Decision level fusion
![[Pasted image 20241212091320.png|600]]

A common way is majority voting, but serial combination (AND) or parallel combination (OR) can also be used.

Be careful when using OR: if a single classifier accepts while the others reject, the user is still accepted (less secure)!
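
A minimal sketch of these three decision-level rules over boolean accept/reject outputs (the function name and rule labels are illustrative):

```python
def fuse_decisions(decisions, rule="majority"):
    """Decision-level fusion of per-subsystem accept/reject outputs."""
    if rule == "and":                    # serial: all must accept (more secure)
        return all(decisions)
    if rule == "or":                     # parallel: one acceptance suffices (less secure)
        return any(decisions)
    return sum(decisions) > len(decisions) / 2   # majority voting

print(fuse_decisions([True, False, True]))          # majority -> True
print(fuse_decisions([True, False, True], "and"))   # -> False
```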
#### Template updating - Co-Update method
(I got distracted here; to be integrated with the slides.)
#### Data normalization
When the minimum and maximum values are known, normalization is trivial.

For this reason, we assumed an exact estimate of the maximum value to be **missing**.

We chose the average value in its place, in order to stress the normalization functions even more.

Normalization functions:

- min/max
    - $s'_{k}=\frac{s_{k}-\min}{\max-\min}$
- z-score
    - $s'_{k}=\frac{s_{k}-\mu}{\sigma}$
- median/MAD
    - $s'_{k}=\frac{s_{k}-\text{median}}{\text{MAD}}$, with $\text{MAD}=\text{median}(|s_{k}-\text{median}|)$
- sigmoid
    - $s'_{k}=\frac{1}{1+e^{-(s_{k}-c)/k}}$
- tanh
    - $s'_{k}=\frac{1}{2}\left[\tanh\left(0.01\,\frac{s_{k}-\mu}{\sigma}\right)+1\right]$
![[Pasted image 20241212094046.png|300]]

The min-max normalization performs a "mapping" (shifting + compression/dilation) of the interval between the minimum and maximum values onto the interval between 0 and 1.

Pro: fixed range between 0 and 1.

Con: the minimum and maximum of each subsystem's scores must be known.

![[Pasted image 20241212093902.png|200]]
Z-score: standardization by mean and variance, widely used.

Con: it does not map the score into a fixed range.

![[Pasted image 20241212093927.png|200]]
Median/MAD: the median is subtracted and the result is divided by the median of the absolute deviations.

It works poorly if the score distribution is not Gaussian; it does not preserve the original distribution and does not guarantee a fixed range either.

![[Pasted image 20241212093943.png|200]]
Sigmoid: maps scores into the open interval (0, 1).

Con 1: it distorts considerably toward the extremes.

Con 2: it depends on the parameters k and c, which in turn depend on the score distribution.

![[Pasted image 20241212094000.png|200]]
Tanh: guarantees the range (0, 1).

Con: it tends to concentrate values excessively around the center (0.5).

![[Pasted image 20241212094016.png|200]]
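
A minimal numpy sketch of the five normalization functions above; the sigmoid defaults for c and k and the 0.01 constant in the tanh estimator are illustrative choices, not values from the slides:

```python
import numpy as np

def min_max(s):
    return (s - s.min()) / (s.max() - s.min())

def z_score(s):
    return (s - s.mean()) / s.std()

def median_mad(s):
    med = np.median(s)
    mad = np.median(np.abs(s - med))        # median absolute deviation
    return (s - med) / mad

def sigmoid(s, c=None, k=None):
    # c (center) and k (slope) should be tuned on the score
    # distribution; mean and std are used here as rough defaults
    c = s.mean() if c is None else c
    k = s.std() if k is None else k
    return 1.0 / (1.0 + np.exp(-(s - c) / k))

def tanh_norm(s):
    return 0.5 * (np.tanh(0.01 * (s - s.mean()) / s.std()) + 1.0)

scores = np.array([10.0, 20.0, 35.0, 60.0, 90.0])
for f in (min_max, z_score, median_mad, sigmoid, tanh_norm):
    print(f.__name__, np.round(f(scores), 3))
```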