## Section: Scientific Foundations

### Mixture models

Participants: Lamiae Azizi, Christine Bakhous, Lotfi Chaari, Senan James Doyle, Jean-Baptiste Durand, Florence Forbes, Stéphane Girard, Vasil Khalidov, Marie-José Martinez, Darren Wraith.

In a first approach, we consider statistical parametric models with
parameter θ, possibly multi-dimensional, usually unknown and to be
estimated. We consider cases where the data naturally divide into
observed data y = {y_{1}, ..., y_{n}} and unobserved or missing data
z = {z_{1}, ..., z_{n}}. The missing datum z_{i} represents, for
instance, the membership of y_{i} in one of a set of K alternative
categories. The distribution of an observed y_{i} can then be written
as a finite mixture of distributions,

f(y_{i} | θ) = Σ_{k=1}^{K} P(z_{i} = k | θ) f(y_{i} | z_{i} = k, θ).
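As an illustration of this finite mixture density, the following sketch evaluates a weighted sum of K Gaussian component densities; the component weights and Gaussian parameters below are invented for the example.

```python
import math

def normal_pdf(y, mu, sigma):
    """Density of a univariate Gaussian N(mu, sigma^2) at y."""
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_density(y, weights, mus, sigmas):
    """f(y) = sum_k P(z = k) f(y | z = k): a K-component Gaussian mixture."""
    return sum(w * normal_pdf(y, m, s) for w, m, s in zip(weights, mus, sigmas))

# Illustrative K = 2 mixture (parameters are made up for the example).
weights = [0.3, 0.7]
mus = [-1.0, 2.0]
sigmas = [0.5, 1.0]
print(mixture_density(0.0, weights, mus, sigmas))
```

Any family of component densities can replace `normal_pdf` here; the mixture structure itself is unchanged.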

These models are interesting in that they may point out hidden
variables responsible for most of the observed variability, given
which the observed variables are *conditionally* independent.
Their estimation is often difficult due to the missing data. The
Expectation-Maximization (EM) algorithm is a general, now standard,
approach to maximizing the likelihood in missing data problems. It
provides parameter estimates but also values for the missing data.
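A minimal sketch of EM for a two-component univariate Gaussian mixture (illustrative only, not the team's implementation; the initialization strategy and the synthetic data are invented for the example). The E-step computes posterior membership probabilities for the missing z_{i}; the M-step re-estimates the parameters from these responsibilities:

```python
import math
import random

def normal_pdf(y, mu, sigma):
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def em_two_gaussians(ys, n_iter=50):
    """EM for a 2-component univariate Gaussian mixture.

    Returns estimated (weights, means, sigmas) and, for each y_i, the
    posterior probability of component 0 -- a value for the missing z_i."""
    # Crude initialization from the data range (an illustrative choice).
    w, mu, sigma = [0.5, 0.5], [min(ys), max(ys)], [1.0, 1.0]
    for _ in range(n_iter):
        # E-step: responsibilities r_i = P(z_i = 0 | y_i, current parameters).
        r = []
        for y in ys:
            p0 = w[0] * normal_pdf(y, mu[0], sigma[0])
            p1 = w[1] * normal_pdf(y, mu[1], sigma[1])
            r.append(p0 / (p0 + p1))
        # M-step: weighted maximum-likelihood updates.
        n0 = sum(r)
        n1 = len(ys) - n0
        w = [n0 / len(ys), n1 / len(ys)]
        mu = [sum(ri * y for ri, y in zip(r, ys)) / n0,
              sum((1 - ri) * y for ri, y in zip(r, ys)) / n1]
        sigma = [max(1e-6, math.sqrt(sum(ri * (y - mu[0]) ** 2 for ri, y in zip(r, ys)) / n0)),
                 max(1e-6, math.sqrt(sum((1 - ri) * (y - mu[1]) ** 2 for ri, y in zip(r, ys)) / n1))]
    return w, mu, sigma, r

# Synthetic data from two well-separated components.
random.seed(0)
ys = [random.gauss(-3.0, 1.0) for _ in range(200)] + [random.gauss(3.0, 1.0) for _ in range(200)]
w, mu, sigma, r = em_two_gaussians(ys)
print(w, mu, sigma)
```

The returned responsibilities illustrate the point above: EM yields not only parameter estimates but also values for the missing memberships.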

Mixture models correspond to independent z_{i}'s. They are increasingly used
in statistical pattern recognition and enable a formal (model-based)
approach to (unsupervised) clustering.
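In the model-based view, a clustering follows by assigning each y_{i} to the category maximizing its posterior membership probability (the MAP rule). A small sketch with fixed, invented Gaussian component parameters:

```python
import math

def normal_pdf(y, mu, sigma):
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def map_cluster(y, weights, mus, sigmas):
    """Assign y to the component k maximizing P(z = k | y) (MAP rule)."""
    posteriors = [w * normal_pdf(y, m, s) for w, m, s in zip(weights, mus, sigmas)]
    return max(range(len(posteriors)), key=lambda k: posteriors[k])

# Two components with illustrative parameters; in practice these would
# first be estimated, e.g. by EM.
labels = [map_cluster(y, [0.5, 0.5], [-2.0, 2.0], [1.0, 1.0])
          for y in [-2.5, -0.1, 1.8, 3.0]]
print(labels)  # [0, 0, 1, 1]
```

With equal weights and variances, the MAP rule here reduces to assigning each point to the nearest component mean.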