Réduction de la dimension et sélection de modèles en classification supervisée [Dimension reduction and model selection in supervised classification]

Abstract : This thesis falls within the framework of statistical learning and studies the supervised classification problem for high-dimensional data. The first part of the document reviews the state of the art on model selection within the framework of Vapnik's theory and on the different approaches to variable selection in statistical learning. The second part presents new tools for model selection and dimension reduction. In Chapter 3, we provide an estimator of the bias between the conditional error and the training error of a classification rule. This estimator is then used to derive a penalized criterion called Swapping, whose penalty is based on the points for which a change of label induces a change of prediction. An application to the choice of k in the kNN algorithm is presented. In Chapter 4, we propose a penalized criterion for variable selection in supervised classification. This criterion provides a theoretical justification of the pruning step of the CART algorithm as an embedded variable selection method. The last chapter is dedicated to a general strategy for aggregating variables for classification, in which the dimension reduction method is adapted to the classification algorithm. Applications to functional and microarray data are provided.
Document type : Theses

Cited literature : 171 references

https://hal.inrae.fr/tel-02824788
Contributor : Migration Prodinra
Submitted on : Saturday, June 6, 2020 - 11:01:36 PM
Last modification on : Friday, June 12, 2020 - 10:43:26 AM

File

51647_20120323031844512_1.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : tel-02824788, version 1
  • PRODINRA : 51647

Citation

Tristan Mary-Huard. Réduction de la dimension et sélection de modèles en classification supervisée. Mathématiques [math]. Université Paris Sud - Paris 11, 2006. Français. ⟨tel-02824788⟩

Metrics

Record views : 37
File downloads : 32