Project : metiss
Section: Application Domains
The field of speaker characterisation and verification covers a variety of tasks that consist in using a speech signal to determine some information concerning the identity of the speaker who uttered it. Indeed, even though the voice characteristics of a person are not unique, many factors (morphological, physiological, psychological, sociological, ...) have an influence on a person's voice. The activities of the METISS group in this domain are mainly focused on speaker verification, i.e. the task of accepting or rejecting an identity claim made by the user of a service with access control.
Speaker recognition and verification have made significant progress with the systematic use of probabilistic models, in particular Hidden Markov Models (for text-dependent applications) and Gaussian Mixture Models (for text-independent applications). As presented in the fundamentals of this report, the current state-of-the-art approaches rely on Bayesian decision theory.
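As an illustration of the Bayesian decision rule in the text-independent case, the following minimal sketch scores a test utterance with two diagonal-covariance Gaussian Mixture Models and compares the log-likelihood ratio to a threshold. It is not the METISS implementation: the function names, the (weights, means, variances) tuple layout, the per-frame averaging and the default threshold of 0.0 are all assumptions made for the example.

```python
import numpy as np

def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D) feature vectors; weights: (K,); means, variances: (K, D).
    """
    frames = np.asarray(frames, dtype=float)
    diff = frames[:, None, :] - means[None, :, :]            # (T, K, D)
    # Log-density of every frame under every Gaussian component.
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_weighted = log_comp + np.log(weights)                # (T, K)
    # Log-sum-exp over components, shifted by the max for numerical stability.
    peak = log_weighted.max(axis=1, keepdims=True)
    log_frame = peak[:, 0] + np.log(np.sum(np.exp(log_weighted - peak), axis=1))
    return float(log_frame.mean())

def verify(frames, speaker_gmm, background_gmm, threshold=0.0):
    """Accept the identity claim if the log-likelihood ratio exceeds the threshold.

    speaker_gmm and background_gmm are hypothetical (weights, means, variances) tuples.
    """
    llr = (gmm_log_likelihood(frames, *speaker_gmm)
           - gmm_log_likelihood(frames, *background_gmm))
    return llr, llr > threshold
```

In practice the threshold would be tuned on development data to reach the desired trade-off between false acceptances and false rejections.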
However, robustness issues remain open: when speaker characteristics are learned from small quantities of data, the trained model performs very poorly because it lacks generalisation capabilities. This problem can partly be overcome by adaptation techniques (following the maximum a posteriori, or MAP, viewpoint), using as prior knowledge either a speaker-independent model or some structural information, for instance a dependency model between local distributions.
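The adaptation idea can be sketched for the simplest case, where only the component means of a speaker-independent GMM are adapted to a small amount of speaker data. This is one common form of MAP adaptation, given here as an assumption rather than as the method used by METISS; the function name and the relevance factor (default 16.0) are hypothetical choices for the example.

```python
import numpy as np

def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """MAP-adapt the means of a diagonal-covariance GMM to speaker data.

    frames: (T, D); weights: (K,); means, variances: (K, D).
    relevance controls how much data a component needs before its mean
    moves away from the speaker-independent prior (assumed value).
    """
    frames = np.asarray(frames, dtype=float)
    diff = frames[:, None, :] - means[None, :, :]             # (T, K, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    # Posterior probability of each component for each frame.
    log_post = log_comp + np.log(weights)
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)                   # (T, K)
    # Soft counts and data means per component.
    n_k = post.sum(axis=0)                                    # (K,)
    data_means = (post.T @ frames) / np.maximum(n_k, 1e-10)[:, None]
    # Interpolate between data mean and prior mean: little data keeps the prior.
    alpha = (n_k / (n_k + relevance))[:, None]
    return alpha * data_means + (1.0 - alpha) * means
```

A component that receives almost no speaker data keeps a mean close to the speaker-independent one, which is what protects the adapted model against the lack of generalisation described above.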
Another key practical issue is that the models deviate from the exact probability density functions, so the scores must be normalised before the likelihood ratio is compared to a decision threshold.
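One possible form of score normalisation, shown here only as an example, is zero normalisation (Z-norm): the raw score of a claimed speaker model is centred and scaled using the scores that the same model gives to a set of impostor utterances. The text does not say which normalisation METISS uses, so the technique, the function names and the impostor score set are assumptions.

```python
import numpy as np

def znorm(raw_score, impostor_scores):
    """Normalise a raw score with impostor score statistics for the same model."""
    impostor_scores = np.asarray(impostor_scores, dtype=float)
    mu = impostor_scores.mean()
    sigma = impostor_scores.std(ddof=1)        # sample standard deviation
    return (raw_score - mu) / max(sigma, 1e-10)

def decide(raw_score, impostor_scores, threshold):
    """Accept the claim if the normalised score exceeds a speaker-independent threshold."""
    return bool(znorm(raw_score, impostor_scores) > threshold)
```

After normalisation, the scores of different speaker models are on a comparable scale, so a single threshold can be shared across speakers.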
The METISS project puts particular effort into these two robustness issues.