Abstract: We address the problem of clustering individuals from "complex" observations, in the sense that they do not satisfy some of the classically adopted simplifying assumptions. In this work, the individuals to be clustered are assumed to be dependent upon one another. We adopt a probabilistic approach based on Markovian models. Three clustering problems are considered.
The first of these relates to high-dimensional data clustering. For this problem, we adopt a non-diagonal Gaussian Markovian model based on the observation that most high-dimensional data actually live in class-dependent subspaces of lower dimension. Such a model requires the estimation of only a reasonable number of parameters.
The second problem is to go beyond the simplifying assumption of unimodal, and in particular Gaussian, independent noise. To this end, we consider the recent triplet Markov field model and propose a new family of triplet Markov field models suited to supervised classification. We illustrate the flexibility and performance of our models on the recognition of real texture images.
Finally, we tackle the problem of clustering with incomplete observations, i.e. observations for which some values are missing. For this we develop a Markovian method which does not require preliminary imputation of the missing data. We present an application of this methodology to a real gene clustering problem.