The SpaCEM3 program
Participants: Lamiae Azizi, Sophie Chopart, Senan James Doyle, Florence Forbes.
SpaCEM3 (Spatial Clustering with EM and Markov Models) is a software package that provides a wide range of supervised and unsupervised clustering algorithms. The main originality of these algorithms is that the clustered objects need not be assumed independent and can be associated with very high-dimensional measurements. A typical example is image segmentation, where the objects are the pixels of a regular grid and each pixel depends on its neighbouring pixels. More generally, the software provides algorithms to cluster multimodal data with an underlying dependence structure that accounts for spatial localisation or for any kind of interaction that can be encoded in a graph.
This software, developed by present and past members of the team, is the result of several research developments on the subject. The current version, 2.09, is distributed under the CeCILL-B license.
Main features. The approach is based on the EM algorithm for clustering and on Markov Random Fields (MRF) to account for dependencies. In addition to standard clustering tools based on independent Gaussian mixture models, SpaCEM3 features include:
The unsupervised clustering of dependent objects. Their dependencies are encoded via a graph, not necessarily regular, and data sets are modelled via Markov random fields and mixture models (e.g. MRF and Hidden MRF). Available Markov models include extensions of the Potts model, with the possibility of defining more general interaction models.
The supervised clustering of dependent objects when standard Hidden MRF (HMRF) assumptions do not hold (i.e. in the case of non-correlated and non-unimodal noise models). The learning and test steps are based on recently introduced Triplet Markov models.
Model selection criteria (BIC, ICL and their mean-field approximations) that select the "best" HMRF according to the data.
The possibility of producing simulated data from:
A specific setting to account for high-dimensional observations.
An integrated framework to deal with missing observations, under the Missing At Random (MAR) hypothesis, with prior imputation (KNN, mean, ...), online imputation (as a step in the algorithm) or without imputation.
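As an illustration of how dependencies encoded in a graph enter a Potts-type model, the following is a minimal sketch in Python. It is not SpaCEM3 code and does not use its interface: the function name `potts_energy`, the edge-list representation and the interaction parameter `beta` are assumptions made for this example only. It computes the energy of a labelling under the basic Potts model, where each pair of neighbouring objects sharing the same label lowers the energy by `beta`.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: an edge is a pair of object identifiers.
Edge = Tuple[int, int]


def potts_energy(labels: Dict[int, int], edges: List[Edge], beta: float) -> float:
    """Energy of a labelling under a basic Potts model.

    Each edge whose two endpoints carry the same label contributes -beta,
    so labellings that agree along the graph have lower energy.
    """
    agreeing = sum(1 for i, j in edges if labels[i] == labels[j])
    return -beta * agreeing


# A small irregular graph (not a grid), given as an edge list.
edges = [(0, 1), (1, 2), (2, 3), (0, 2)]

# Two candidate labellings of the four objects into two classes.
labels_smooth = {0: 0, 1: 0, 2: 0, 3: 1}
labels_noisy = {0: 0, 1: 1, 2: 0, 3: 1}

energy_smooth = potts_energy(labels_smooth, edges, beta=1.0)
energy_noisy = potts_energy(labels_noisy, edges, beta=1.0)
```

With a positive `beta`, the labelling that agrees along more edges receives the lower energy and hence the higher prior probability, which is how such a model favours spatially coherent clusters. The more general interaction models mentioned above would replace the single `beta` with label-dependent weights.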
The software is available at http://spacem3.gforge.inria.fr. A paper describing the software and its main functionalities has recently been published in French. A user manual in English is available on the web site above, together with example data sets.