Section: New Results
Clustering and Optimal Segmentation of Curves
Participant: Yves Lechevallier.
Functional Data Analysis is an extension of traditional data analysis to functional data. In this framework, each individual is described by one or several functions rather than by a vector of R^n. This approach makes it possible to take into account the regularity of the observed functions.
In 2010, we continued our work on an exploratory analysis algorithm for functional data, in collaboration with F. Rossi and G. Hebrail from Telecom Paris Tech. Our method [21], [31] partitions a set of functions into K clusters and represents each cluster by a simple prototype (e.g., piecewise constant). The total number of segments in the prototypes, P, is chosen by the user and optimally distributed among the clusters via two dynamic programming algorithms. The main idea is to provide the analyst with a summary of the set of functions whose complexity remains manageable.
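The distribution of the P segments among the K clusters can be illustrated with a small dynamic program. The following is a minimal sketch, not the implementation of [21], [31]: it assumes that a table cost[k][p-1], giving the approximation error of cluster k when its prototype uses p segments, has already been computed. The function name allocate_segments and the table layout are hypothetical.

```python
def allocate_segments(cost, total_segments):
    """Distribute total_segments among clusters so the summed error is minimal.

    cost[k][p-1] is assumed to hold the error of cluster k when its prototype
    uses p segments (p >= 1). Returns (minimal total error, segments per cluster).
    """
    n_clusters = len(cost)
    inf = float("inf")
    # best[k][q]: minimal error for the first k clusters using q segments in total
    best = [[inf] * (total_segments + 1) for _ in range(n_clusters + 1)]
    # choice[k][q]: number of segments given to cluster k in that optimum
    choice = [[0] * (total_segments + 1) for _ in range(n_clusters + 1)]
    best[0][0] = 0.0

    for k in range(1, n_clusters + 1):
        for q in range(k, total_segments + 1):
            # every earlier cluster needs at least one segment
            max_p = min(len(cost[k - 1]), q - (k - 1))
            for p in range(1, max_p + 1):
                candidate = best[k - 1][q - p] + cost[k - 1][p - 1]
                if candidate < best[k][q]:
                    best[k][q] = candidate
                    choice[k][q] = p

    if best[n_clusters][total_segments] == inf:
        raise ValueError("no feasible allocation for this number of segments")

    # backtrack from the last cluster to recover the allocation
    allocation = []
    q = total_segments
    for k in range(n_clusters, 0, -1):
        p = choice[k][q]
        allocation.append(p)
        q -= p
    allocation.reverse()
    return best[n_clusters][total_segments], allocation


# Hypothetical error tables for two clusters, up to four segments each
example_cost = [[10.0, 1.0, 0.5, 0.4], [8.0, 7.0, 6.5, 6.4]]
total_error, allocation = allocate_segments(example_cost, 4)
```

In this toy example, the first cluster benefits much more from a second segment than the second cluster does, so the allocation balances the two rather than spending all extra segments on one prototype.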
We propose to merge clustering and segmentation: we build a K-means-like clustering of a set of functions in which each prototype is a simple function defined piecewise. The input interval of each prototype is partitioned into sub-intervals on which the prototype assumes a simple form. Using dynamic programming, we obtain an optimal segmentation for each prototype, while the number of segments used in each cluster is also chosen optimally with respect to a user-specified total number of segments. In the case of piecewise constant prototypes, a set of functions is summarized via 2P-K real values, where K is the number of prototypes and P the total number of segments used to represent the prototypes: a prototype with p segments requires p constant values and p-1 breakpoints, i.e. 2p-1 values, and summing over the K prototypes gives 2P-K.
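The optimal segmentation of a single piecewise constant prototype can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code of [21], [31]: the curves of one cluster are assumed to be sampled on a common grid, the error is assumed to be the sum of squared deviations, and the names optimal_segmentation and breakpoints are hypothetical.

```python
import numpy as np


def optimal_segmentation(curves, max_segments):
    """Dynamic program for the best piecewise constant prototype of one cluster.

    curves: array of shape (n_curves, n_points), all sampled on the same grid.
    Returns (best, back), where best[p, j] is the minimal squared error when
    the first j grid points are covered by p segments, and back[p, j] is the
    start index of the last segment in that optimum.
    """
    x = np.asarray(curves, dtype=float)
    n_curves, n_points = x.shape
    # prefix sums over the grid give each segment cost in constant time
    s1 = np.concatenate(([0.0], np.cumsum(x.sum(axis=0))))
    s2 = np.concatenate(([0.0], np.cumsum((x ** 2).sum(axis=0))))

    def segment_cost(i, j):
        # squared error of the best constant (the mean) on grid points i..j-1
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total ** 2 / (n_curves * (j - i))

    best = np.full((max_segments + 1, n_points + 1), np.inf)
    back = np.zeros((max_segments + 1, n_points + 1), dtype=int)
    best[0, 0] = 0.0
    for p in range(1, max_segments + 1):
        for j in range(p, n_points + 1):
            for i in range(p - 1, j):
                candidate = best[p - 1, i] + segment_cost(i, j)
                if candidate < best[p, j]:
                    best[p, j] = candidate
                    back[p, j] = i
    return best, back


def breakpoints(back, n_segments, n_points):
    """Recover the interior breakpoints (grid indices) by backtracking."""
    cuts = []
    j = n_points
    for p in range(n_segments, 0, -1):
        j = back[p, j]
        cuts.append(j)
    return sorted(cuts)[1:]  # drop the leading 0, which is not a breakpoint


# Two identical step-shaped curves: one breakpoint is enough for a perfect fit
example = [[0, 0, 0, 5, 5, 5], [0, 0, 0, 5, 5, 5]]
best, back = optimal_segmentation(example, 2)
cuts = breakpoints(back, 2, 6)
```

The last column of best, best[p, n_points] for p = 1, 2, ..., gives the error of the cluster as a function of the number of segments, which is the kind of table needed to decide how many segments each cluster should receive.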