
Réduction de la dimension et sélection de modèles en classification supervisée (Dimension reduction and model selection in supervised classification)

Abstract : This thesis takes place within the framework of statistical learning. We study the supervised classification problem for high-dimensional data. The first part of the document presents the state of the art regarding model selection in the Vapnik theory framework and different variable selection approaches in statistical learning. The second part presents some new tools for model selection and dimension reduction. In Chapter 3, we provide an estimator of the bias between the conditional and the training error of a classification rule. This estimator is then used to derive a penalized criterion called Swapping. The penalty is based on the points for which a change of label induces a change of prediction. An application to the choice of k in the kNN algorithm is presented. In Chapter 4, we propose a penalized criterion for variable selection in supervised classification. This criterion provides a theoretical justification of the pruning step of the CART algorithm as an embedded variable selection method. The last chapter is dedicated to a general strategy to aggregate variables in view of classification. The dimension reduction method is adapted to the classification algorithm. Applications to functional and microarray data are provided.
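The Swapping idea for choosing k in kNN can be made concrete with a toy sketch. The code below is an illustration only, not the thesis's exact criterion: the penalty counts training points whose own prediction changes when their label is flipped, and the penalty weight `lam`, the self-inclusive kNN predictor, and the additive score are all assumptions made for this sketch.

```python
import numpy as np

def knn_predict(X, y, x, k):
    """Majority vote among the k nearest training points (self included)."""
    dist = np.linalg.norm(X - x, axis=1)
    nearest = np.argsort(dist)[:k]
    return int(y[nearest].mean() >= 0.5)  # binary labels in {0, 1}

def training_error(X, y, k):
    """Resubstitution error of the k-NN rule on its own training sample."""
    preds = np.array([knn_predict(X, y, X[i], k) for i in range(len(y))])
    return float(np.mean(preds != y))

def swapping_penalty(X, y, k):
    """Count training points whose own prediction flips when their label is swapped."""
    count = 0
    for i in range(len(y)):
        before = knn_predict(X, y, X[i], k)
        y_swap = y.copy()
        y_swap[i] = 1 - y_swap[i]          # swap the label of point i
        after = knn_predict(X, y_swap, X[i], k)
        count += int(before != after)
    return count

def select_k(X, y, candidates, lam=1.0):
    """Pick k minimising training error + lam * (swapping penalty / n)."""
    n = len(y)
    scores = {k: training_error(X, y, k) + lam * swapping_penalty(X, y, k) / n
              for k in candidates}
    return min(scores, key=scores.get)
```

The penalty behaves as an overfitting measure: with k = 1 the rule memorises each label, so every swap flips the corresponding prediction and the penalty is maximal, while larger k yields predictions that are stable under a single label swap.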
Document type : Thesis

Cited literature : 171 references
Contributor : Migration ProdInra
Submitted on : Saturday, June 6, 2020 - 11:01:36 PM
Last modification on : Saturday, June 25, 2022 - 9:14:51 PM


Publisher files allowed on an open archive


  • HAL Id : tel-02824788, version 1
  • PRODINRA : 51647



Tristan Mary-Huard. Réduction de la dimension et sélection de modèles en classification supervisée. Mathématiques [math]. Université Paris Sud - Paris 11, 2006. Français. ⟨tel-02824788⟩


