13. Multi biometric

The idea is to complement the weaknesses of a system with the strengths of another.

Examples:

  • multiple biometric traits (e.g. signature + fingerprint, used in India, USA, etc.)
    • most obvious meaning of multi biometrics
  • multiple instances: same trait, but acquired from different elements (e.g. 2 or more different fingers, both irises, both ears, multiple instances of hand geometry...)
  • repeated instances: same trait, same element, but acquired multiple times
  • multiple algorithms: same trait, same element but using multiple classifiers
    • exploits the complementary strengths and weaknesses of the different classifiers
  • multiple sensors: e.g. fingerprint acquired with both an optical and a capacitive sensor

Where does the fusion happen? It can happen

  • at sensor level:
    • not always feasible
  • at feature level: fusing feature vectors before matching
    • not always feasible: feature vectors should be comparable in nature and size
    • an example is when we have multiple samples of the same trait; in this case they will certainly be comparable
  • score level fusion: or match level fusion. Consists of fusing the scores (probability scores) or rankings
    • most feasible solution
    • each system works by itself
    • scores need to be comparable: normalization in a common range may be required
  • decision level fusion: each subsystem takes its own decision and the separate decisions are then combined (see slide)

!Pasted image 20241212084256.png

Feature level fusion

!Pasted image 20241212084349.png

Better results are expected, since much more information is still present. Possible problems:

  • incompatible feature set
  • feature vector combination may cause "curse of dimensionality"
  • a more complex matcher may be required
  • combined vectors may include noisy or redundant data.
Feature level fusion: serial

Example: use SIFT (Scale-Invariant Feature Transform). Phases:

  • feature extraction (SIFT feature set)
  • feature normalization: required due to the possible significant differences in the scale of the vector values
  • a single vector is created by concatenating the two feature vectors

Problems to address:

  • feature selection / reduction
    • it is more efficient to select a few features rather than using the entire vector; possible techniques include
      • k-means clustering, keeping only the cluster centers
        • performed after linking the two normalized vectors
      • neighborhood elimination
        • points at a certain distance are eliminated
        • performed before linking, on the single vectors
      • points belonging to specific regions
        • only points belonging to specific regions of the trait (e.g. eyes, nose, mouth in the case of the face) are kept
  • matching
    • point pattern matching (see the toy sketch after this list)
      • method to find the number of paired "points" between the probe vector and the gallery one
      • two points are paired if their distance is smaller than a threshold
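
A minimal toy sketch of the serial pipeline just described (the SIFT extraction step is assumed to have already produced two keypoint descriptor sets; vector sizes, the number of cluster centers and the matching threshold are illustrative placeholders):

```python
import numpy as np
from sklearn.cluster import KMeans

def normalize(F):
    # feature normalization: z-score each descriptor dimension, since the two
    # feature sets may have very different value scales
    return (F - F.mean(axis=0)) / (F.std(axis=0) + 1e-9)

def serial_fusion(F1, F2, n_centers=32):
    # link the two normalized feature sets into a single set of points,
    # then reduce it with k-means, keeping only the cluster centers
    fused = np.vstack([normalize(F1), normalize(F2)])
    km = KMeans(n_clusters=n_centers, n_init=10).fit(fused)
    return km.cluster_centers_

def point_pattern_match(probe, gallery, thr=1.0):
    # count paired points: two points are paired if their distance
    # is smaller than a threshold
    d = np.linalg.norm(probe[:, None, :] - gallery[None, :, :], axis=2)
    return (d.min(axis=1) < thr).sum() / len(probe)

# hypothetical SIFT-like feature sets (n_keypoints x descriptor_dim)
probe = serial_fusion(np.random.randn(300, 128), np.random.randn(250, 128))
gallery = serial_fusion(np.random.randn(300, 128), np.random.randn(250, 128))
print(point_pattern_match(probe, gallery))
```
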
Feature level fusion: parallel

Parallel combination of the two vectors:

  • vector normalization
    • shorter vector is extended to match the size of the other one
    • e.g. zero-padding
  • pre-processing of vectors (see the sketch after this list)
    • step 1: transform the vectors into unit vectors (dividing them by their L2 norm)
    • step 2: weighted combination through a coefficient $\theta$, based on the lengths of X and Y
    • we can then use X as the real part and Y as the imaginary part of the final (complex) vector
  • further feature processing:
    • using linear techniques like PCA, K-L expansion, LDA
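
A minimal sketch of the parallel combination, assuming two already-extracted feature vectors; the way $\theta$ is derived from the vector lengths here is only a placeholder, since the notes do not give the exact formula:

```python
import numpy as np

def parallel_fusion(x, y, theta=None):
    nx, ny = len(x), len(y)
    n = max(nx, ny)
    # extend the shorter vector with zero-padding to match the longer one
    x = np.pad(x, (0, n - nx)).astype(float)
    y = np.pad(y, (0, n - ny)).astype(float)
    # step 1: transform both into unit vectors (divide by the L2 norm)
    x /= np.linalg.norm(x)
    y /= np.linalg.norm(y)
    # step 2: weighted combination; placeholder weight based on the original lengths
    if theta is None:
        theta = nx / (nx + ny)
    # X becomes the real part and Y the imaginary part of the fused vector
    return theta * x + 1j * (1 - theta) * y

z = parallel_fusion(np.random.randn(100), np.random.randn(80))
print(z.shape, z.dtype)   # (100,) complex128
```
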
Feature level fusion: CCA

The idea is to find a pair of transformations that maximizes the correlation between characteristics
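
A minimal sketch using scikit-learn's CCA; the feature matrices and the number of components are hypothetical:

```python
import numpy as np
from sklearn.cross_decomposition import CCA

# hypothetical feature matrices from two modalities (n_samples x n_features)
X = np.random.randn(200, 60)   # e.g. face features
Y = np.random.randn(200, 40)   # e.g. fingerprint features

# learn the pair of linear transformations that maximizes the correlation
# between the projected characteristics
cca = CCA(n_components=20)
Xc, Yc = cca.fit_transform(X, Y)

# fused representation: concatenation of the two correlated projections
fused = np.concatenate([Xc, Yc], axis=1)
print(fused.shape)   # (200, 40)
```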

Score level fusion

!Pasted image 20241212085003.png

Transformation based: scores from different matchers are first normalized in a common domain and then combined using fusion rules

Classifier based: the scores are considered as features and included into a feature vector. A further classifier is trained (can be an SVM, a decision tree, a neural network...)
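
A minimal sketch of the classifier-based approach with an SVM; the matcher scores and labels below are made-up toy data:

```python
import numpy as np
from sklearn.svm import SVC

# each row holds the scores returned by the individual matchers
# (e.g. face, fingerprint) for one comparison attempt
scores = np.array([[0.91, 0.85], [0.88, 0.79], [0.80, 0.90],
                   [0.30, 0.42], [0.25, 0.33], [0.35, 0.20]])
labels = np.array([1, 1, 1, 0, 0, 0])   # 1 = genuine, 0 = impostor

# the score vector is treated as a feature vector and a further
# classifier is trained on top of it
fusion_clf = SVC(kernel="rbf", probability=True).fit(scores, labels)

# fused decision for the matcher scores of a new probe
print(fusion_clf.predict_proba([[0.75, 0.60]]))
```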

Fusion rules

Abstract level: each classifier outputs a class label.

Majority vote: each classifier votes for a class.

Rank level: each classifier outputs its class ranking.

Borda count:

  • each classifier produces a ranking of the classes according to the probability of the pattern belonging to each of them
  • rankings are converted into scores and summed up
  • the class with the highest final score is the one chosen by the multi-classifier

E.g. with 4 available positions, the most probable class gets rank 4 and the least probable gets rank 1; the ranks from every classifier are then summed. Borda count can also be used in open-set identification, using a threshold to discard low scores (the score being the sum of ranks).
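
A minimal Borda-count sketch with hypothetical rankings from three classifiers over four classes (rank 4 = most probable, rank 1 = least probable, as in the example above):

```python
import numpy as np

classes = ["A", "B", "C", "D"]

# each row: the ranks one classifier assigns to the four classes
ranks = np.array([[4, 3, 2, 1],
                  [3, 4, 2, 1],
                  [4, 2, 3, 1]])

borda = ranks.sum(axis=0)                     # sum the ranks of every classifier
best = int(np.argmax(borda))
print(dict(zip(classes, borda.tolist())))     # {'A': 11, 'B': 9, 'C': 7, 'D': 3}

# open-set identification: accept only if the summed ranks pass a threshold
threshold = 10
print(classes[best] if borda[best] >= threshold else "rejected")
```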

Measurement level: each classifier outputs its classification score.

!Pasted image 20241212090608.png

Different methods are possible (e.g. sum, weighted sum, mean, product, weighted product, max, min, etc.)

  • sum: the sum of the returned confidence vectors is computed and the pattern is classified according to the highest value (see the sketch below)
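
A minimal sketch of the sum and weighted-sum rules over hypothetical per-class confidence vectors returned by two classifiers:

```python
import numpy as np

# hypothetical confidence vectors (one value per class) from two classifiers
conf_face = np.array([0.70, 0.20, 0.10])
conf_finger = np.array([0.50, 0.40, 0.10])

# sum rule: add the confidence vectors, pick the class with the highest value
print(np.argmax(conf_face + conf_finger))            # class 0

# weighted sum: weights may reflect the reliability of each matcher
w_face, w_finger = 0.6, 0.4
print(np.argmax(w_face * conf_face + w_finger * conf_finger))
```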

Scores from different matchers are typically heterogeneous:

  • different range
  • similarity vs distance
  • different distributions

Normalization is required! But there are issues to consider when choosing a normalization method:

  • robustness: the transformation should not be influenced by outliers
  • effectiveness: the estimated parameters of the score distribution should approximate the real values as closely as possible
Reliability

A reliability measure is computed for each single response of each subsystem before fusing them into a final response. Confidence margins are a possible solution. Poh and Bengio proposed a solution based on FAR and FRR: $M(\nabla) = |FAR(\nabla) - FRR(\nabla)|$
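
A small sketch that estimates FAR and FRR empirically at a given threshold and computes the margin M; the score distributions are made up, and the estimation procedure in the original Poh-Bengio work may differ:

```python
import numpy as np

def reliability_margin(genuine, impostor, thr):
    far = np.mean(impostor >= thr)    # impostors wrongly accepted
    frr = np.mean(genuine < thr)      # genuine users wrongly rejected
    return abs(far - frr)             # M = |FAR - FRR| at this threshold

# hypothetical similarity scores for genuine and impostor comparisons
genuine = np.random.normal(0.8, 0.1, 1000)
impostor = np.random.normal(0.4, 0.1, 1000)
print(reliability_margin(genuine, impostor, thr=0.6))
```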

Decision level fusion

!Pasted image 20241212091320.png

A common approach is majority voting, but serial combination (AND) or parallel combination (OR) can also be used. Be careful when using OR: if a single classifier accepts while the other fails, the user is accepted anyway (less secure)!
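
A minimal sketch of the three decision-level combinations mentioned above, over hypothetical accept/reject decisions of the individual subsystems:

```python
def fuse_decisions(decisions, rule="majority"):
    # decisions: list of booleans, one accept/reject per subsystem
    if rule == "and":    # serial combination: every subsystem must accept
        return all(decisions)
    if rule == "or":     # parallel combination: one accept is enough (less secure!)
        return any(decisions)
    return sum(decisions) > len(decisions) / 2   # majority voting

votes = [True, False, True]
print(fuse_decisions(votes, "and"),       # False
      fuse_decisions(votes, "or"),        # True
      fuse_decisions(votes, "majority"))  # True
```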

Template updating - Co-Update method

(I got distracted here; to be integrated with the slides.)

Data normalization

When the minimum and maximum values are known, normalization is trivial. For this reason, we assume that an exact estimate of the maximum value is not available and use the average value in its place, in order to stress the normalization functions even more.

Normalization functions:

  • min/max
    • $s'_{k}=\frac{s_{k}-\min}{\max-\min}$
  • z-score

  • median/mad

  • sigmoid

  • tanh

!Pasted image 20241212094046.png

The min-max normalization technique performs a "mapping" (shifting + compression/dilation) of the interval between the minimum and maximum values onto the interval between 0 and 1. Pro: range between 0 and 1. Con: the minimum and maximum score of each subsystem must be known.

!Pasted image 20241212093902.png

Z-score: standardization by mean and variance, widely used. Con: it does not bring the score into a fixed range.

!Pasted image 20241212093927.png

Median/MAD: the median is subtracted and the result is divided by the median of the absolute deviations (MAD). It works poorly if the score distribution is not Gaussian; it does not preserve the original distribution and does not guarantee a fixed range either.

!Pasted image 20241212093943.png

Sigmoid: maps scores into the open interval (0, 1). Con 1: it distorts heavily towards the extremes. Con 2: it depends on the parameters k and c, which in turn depend on the score distribution.

!Pasted image 20241212094000.png

Tanh: guarantees the range (0, 1). Con: it tends to concentrate values excessively towards the center (0.5).

!Pasted image 20241212094016.png
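
A compact sketch of the five normalization functions discussed above; the sigmoid parameters k and c and the tanh constants are placeholders and may differ from the ones used in the slides:

```python
import numpy as np

def min_max(s):
    return (s - s.min()) / (s.max() - s.min())        # fixed range [0, 1]

def z_score(s):
    return (s - s.mean()) / s.std()                   # no fixed range

def median_mad(s):
    med = np.median(s)
    mad = np.median(np.abs(s - med))                  # median absolute deviation
    return (s - med) / mad                            # no fixed range

def sigmoid(s, k=1.0, c=None):
    c = s.mean() if c is None else c                  # k, c depend on the score distribution
    return 1.0 / (1.0 + np.exp(-k * (s - c)))         # open interval (0, 1)

def tanh_norm(s):
    # tanh-estimator style normalization; squeezes values towards 0.5
    return 0.5 * (np.tanh(0.01 * (s - s.mean()) / s.std()) + 1.0)

scores = np.random.normal(50, 10, 1000)
for f in (min_max, z_score, median_mad, sigmoid, tanh_norm):
    out = f(scores)
    print(f"{f.__name__:>10}: min={out.min():.3f} max={out.max():.3f}")
```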