
Model selection for unsupervised classification. Choosing the number of classes.

Abstract : The work reported here is set in the statistical framework of model-based clustering. We focus in particular on choosing the number of classes and on the ICL model selection criterion. A fruitful approach to studying this criterion theoretically is to consider a contrast related to the clustering purpose. This leads to the definition and study of a new estimator and new model selection criteria. Practical solutions are provided to compute them; these can also be applied to the computation of the usual maximum likelihood estimator in mixture models. The slope heuristics is applied to calibrate the penalized criteria considered. Its theoretical bases are therefore recalled in detail, and two approaches to its application are studied. Another approach to model-based clustering is also considered, in which each class may itself be modeled by a Gaussian mixture. A methodology is proposed, notably to address the question of which components should be merged. Finally, a criterion is proposed that makes it possible to choose a number of components (when identified with the number of classes) related to a known external classification.
Contributor : Jean-Patrick Baudry
Submitted on : Thursday, March 4, 2010 - 6:48:09 PM
Last modification on : Wednesday, September 16, 2020 - 5:06:34 PM
Long-term archiving on : Friday, June 18, 2010 - 10:19:15 PM


  • HAL Id : tel-00461550, version 1



Jean-Patrick Baudry. Sélection de modèle pour la classification non supervisée. Choix du nombre de classes. Mathematics [math]. Université Paris Sud - Paris XI, 2009. French. ⟨tel-00461550⟩


