## Section: Scientific Foundations
### Mixture models

Participants: Angelika Studeny, Thomas Vincent, Christine Bakhous, Lotfi Chaari, Senan James Doyle, Jean-Baptiste Durand, Florence Forbes, Stéphane Girard, Marie-José Martinez, Darren Wraith.

Keywords: mixture of distributions, EM algorithm, missing data, conditional independence, statistical pattern recognition, clustering, unsupervised and partially supervised learning

In a first approach, we consider statistical parametric models, $\theta$ being the parameter, possibly multi-dimensional, usually unknown and to be estimated. We consider cases where the data naturally divide into observed data $y = (y_1, \ldots, y_n)$ and unobserved or missing data $z = (z_1, \ldots, z_n)$. The missing data $z_i$ represent, for instance, membership of one of a set of $K$ alternative categories. The distribution of an observed $y_i$ can then be written as a finite mixture of distributions,

$$f(y_i \mid \theta) = \sum_{k=1}^{K} P(z_i = k \mid \theta)\, f(y_i \mid z_i = k, \theta).$$

These models are interesting in that they may point to hidden
variables that are responsible for most of the observed
variability and *conditionally* on which the observed variables
are independent. Their estimation is often difficult because of
the missing data. The Expectation-Maximization (EM) algorithm is
a general and now standard approach to maximizing the likelihood
in missing-data problems. It provides parameter estimates as well
as values for the missing data.
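
As an illustration of the EM iteration described above, the following is a minimal sketch for a univariate Gaussian mixture. The Gaussian component family, the quantile-based initialisation, the variance floor, and all names (`em_gaussian_mixture`, `normal_pdf`, `resp`) are assumptions made for this example and are not taken from the text. The returned `resp[i][k]` plays the role of the "values for the missing data": it estimates the probability that observation `i` belongs to category `k`.

```python
import math
import random


def normal_pdf(y, mean, var):
    """Density of a univariate Gaussian N(mean, var) evaluated at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gaussian_mixture(ys, n_components=2, n_iter=50):
    """EM for a univariate Gaussian mixture (illustrative sketch only).

    Returns (weights, means, variances, resp), where resp[i][k] estimates
    the posterior probability that observation i belongs to component k.
    """
    n = len(ys)
    # Assumed initialisation: means at evenly spaced sample quantiles,
    # a common pooled variance, and equal mixing weights.
    sorted_ys = sorted(ys)
    means = [sorted_ys[int((k + 0.5) * n / n_components)] for k in range(n_components)]
    grand_mean = sum(ys) / n
    pooled_var = sum((y - grand_mean) ** 2 for y in ys) / n
    variances = [pooled_var] * n_components
    weights = [1.0 / n_components] * n_components

    resp = []
    for _ in range(n_iter):
        # E-step: posterior membership probabilities under current parameters.
        resp = []
        for y in ys:
            joint = [w * normal_pdf(y, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: weighted maximum-likelihood updates of the parameters.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * y for r, y in zip(resp, ys)) / nk
            var_k = sum(r[k] * (y - means[k]) ** 2 for r, y in zip(resp, ys)) / nk
            variances[k] = max(var_k, 1e-6)  # floor avoids degenerate components
    return weights, means, variances, resp


# Usage on synthetic data: two well-separated groups (illustrative values).
rng = random.Random(1)
ys = ([rng.gauss(0.0, 1.0) for _ in range(200)]
      + [rng.gauss(5.0, 1.0) for _ in range(200)])
weights, means, variances, resp = em_gaussian_mixture(ys)
```

A fixed number of iterations is used here for simplicity; in practice one would typically monitor the log-likelihood and stop once it no longer increases noticeably.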

Mixture models correspond to the case where the missing variables $z_i$ are independent. They are increasingly used in statistical pattern recognition, where they enable a formal (model-based) approach to (unsupervised) clustering.
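
One common way to turn a fitted mixture into a clustering, given here as a sketch under assumed notation ($\hat{\theta}$ for the estimated parameter, $K$ for the number of categories, $y_i$ for an observation and $z_i$ for its unobserved category), is to assign each observation to the category with the highest posterior probability, obtained from Bayes' formula:

$$\hat{z}_i = \arg\max_{k \in \{1, \ldots, K\}} P(z_i = k \mid y_i, \hat{\theta}), \qquad P(z_i = k \mid y_i, \hat{\theta}) = \frac{P(z_i = k \mid \hat{\theta})\, f(y_i \mid z_i = k, \hat{\theta})}{\sum_{l=1}^{K} P(z_i = l \mid \hat{\theta})\, f(y_i \mid z_i = l, \hat{\theta})}.$$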