Project : metiss
Section: Application Domains
Keywords : speaker recognition, user authentication, voice signature.
The field of speaker characterisation and verification covers a variety of tasks that consist in using a speech signal to determine some information concerning the identity of the speaker who uttered it. Although a person's voice characteristics are not unique, many factors (morphological, physiological, psychological, sociological, etc.) influence a person's voice and make it informative about identity. One focus of the METISS group in this domain is speaker verification, i.e. the task of accepting or rejecting an identity claim made by the user of a service with access control. We also dedicate some effort to the more general problem of speaker characterisation, with two aims: speaker indexing in the context of information retrieval and speaker selection in the context of speaker recognition.
Speaker recognition and verification have made significant progress with the systematic use of probabilistic models, in particular Hidden Markov Models (for text-dependent applications) and Gaussian Mixture Models (for text-independent applications). As presented in the fundamentals of this report, the current state-of-the-art approaches rely on Bayesian decision theory.
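As an illustration of the decision rule mentioned above, the following minimal sketch scores a test utterance with a log-likelihood ratio between a claimed-speaker Gaussian Mixture Model and a background model, then compares it to a threshold. It is a sketch under stated assumptions, not the METISS implementation: the diagonal covariances, the frame-averaged score, the function names and the default threshold are all hypothetical choices made for the example.

```python
import math


def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: list of feature vectors; weights, means, variances: one entry
    per mixture component (assumed layout for this sketch).
    """
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            log_p = math.log(w)
            for xd, md, vd in zip(x, mu, var):
                log_p += -0.5 * (math.log(2 * math.pi * vd) + (xd - md) ** 2 / vd)
            comp_logs.append(log_p)
        # Log-sum-exp over components, for numerical stability.
        m = max(comp_logs)
        total += m + math.log(sum(math.exp(c - m) for c in comp_logs))
    return total / len(frames)


def verify(frames, speaker_gmm, background_gmm, threshold=0.0):
    """Return (log-likelihood ratio, accept decision) for an identity claim.

    Each model is a (weights, means, variances) tuple. The threshold value
    is a hypothetical default; in practice it is tuned on held-out data.
    """
    llr = (gmm_log_likelihood(frames, *speaker_gmm)
           - gmm_log_likelihood(frames, *background_gmm))
    return llr, llr >= threshold
```

The claim is accepted when the utterance is better explained by the claimed speaker's model than by the background model, by a margin set through the threshold.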
However, robustness issues remain open: when speaker characteristics are learned from small quantities of data, the trained model performs very poorly because it lacks generalisation capabilities. This problem can partly be overcome by adaptation techniques (following the MAP viewpoint), using either a speaker-independent model as general knowledge, or some structural information, for instance a dependency model between local distributions.
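To make the adaptation idea concrete, here is a minimal sketch of one widely used MAP-style scheme, in which only the component means of a speaker-independent GMM are shifted towards the enrolment data. The source does not specify which variant is used; the means-only update, the relevance factor and its default value, and the function name are assumptions made for this example.

```python
import math


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Adapt the means of a diagonal-covariance GMM to enrolment frames.

    The prior model (weights, means, variances) plays the role of the
    speaker-independent model. `relevance` (hypothetical default) controls
    how much data is needed before a mean moves away from its prior value.
    """
    n_comp, dim = len(weights), len(means[0])
    counts = [0.0] * n_comp
    sums = [[0.0] * dim for _ in range(n_comp)]
    for x in frames:
        # Posterior probability of each component given the frame.
        logs = []
        for w, mu, var in zip(weights, means, variances):
            lp = math.log(w)
            for xd, md, vd in zip(x, mu, var):
                lp += -0.5 * (math.log(2 * math.pi * vd) + (xd - md) ** 2 / vd)
            logs.append(lp)
        m = max(logs)
        exps = [math.exp(lp - m) for lp in logs]
        z = sum(exps)
        for k in range(n_comp):
            gamma = exps[k] / z
            counts[k] += gamma
            for d in range(dim):
                sums[k][d] += gamma * x[d]
    adapted = []
    for k in range(n_comp):
        # Data-dependent interpolation weight between data mean and prior mean.
        alpha = counts[k] / (counts[k] + relevance)
        data_mean = [s / counts[k] for s in sums[k]] if counts[k] > 0 else means[k]
        adapted.append([alpha * e + (1 - alpha) * p
                        for e, p in zip(data_mean, means[k])])
    return adapted
```

With little enrolment data the adapted means stay close to the speaker-independent ones, which is the behaviour that limits the loss of generalisation described above.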
Another key issue in practice is the uncontrollable deviation of the models from the exact probability density functions, which requires a normalisation step before the verification score is compared to a decision threshold.
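One simple instance of such a normalisation step, given here only as an example, is to standardise the raw score with impostor score statistics estimated for the claimed speaker's model (often called zero normalisation). The source does not state which normalisation is used, so the technique and the function names below are assumptions.

```python
import statistics


def impostor_stats(impostor_scores):
    """Mean and standard deviation of scores from known impostor trials."""
    return statistics.mean(impostor_scores), statistics.pstdev(impostor_scores)


def normalise_score(raw_score, mu, sigma):
    """Standardise a raw verification score with impostor statistics."""
    if sigma == 0:
        raise ValueError("impostor scores have zero spread; cannot normalise")
    return (raw_score - mu) / sigma
```

After normalisation, scores obtained with different speaker models lie on a comparable scale, so a single decision threshold becomes more meaningful.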
In the context of speaker verification, the METISS project puts particular effort into these robustness issues. Algorithmic approaches are also developed to improve scalability, reduce complexity and distribute processing, specifically to address needs related to the implementation of this technology on personal devices.
Various other topics of speaker characterisation are linked to speaker recognition and verification, in particular speaker elicitation, i.e. how to select a representative subset of speakers from a larger population, and speaker representation, i.e. how to represent a new speaker in reference to a given speaker population.