# SELECTION DE VARIABLES POUR LA DISCRIMINATION EN GRANDE DIMENSION ET CLASSIFICATION DE DONNEES FONCTIONNELLES

Abstract : This thesis is in nonparametric statistics and concerns classification and discrimination in high dimension, with a particular focus on variable selection. The first part is devoted to variable selection through CART, in both the regression and binary classification frameworks. The proposed exhaustive procedure is based on model selection, which leads to "oracle" inequalities and makes it possible to perform variable selection by penalized empirical contrast. The second part is motivated by an industrial problem. It consists of determining which of the temporal signals measured during experiments are able to explain subjective drivability, and then identifying the ranges responsible for this relevance. The adopted methodology combines preprocessing of the signals, dimensionality reduction by compression in a common wavelet basis, and selection of useful variables using CART and a stepwise strategy. The last part deals with functional data classification with k-nearest neighbors. The procedure consists of applying k-nearest neighbors to the coordinates of the projection of the data onto a suitably chosen finite-dimensional space. It involves simultaneously selecting the dimension of the space and the number of neighbors. The traditional version of k-nearest neighbors and a slightly penalized version are studied theoretically. A study on real and simulated data shows that introducing a small penalty term stabilizes the selection while preserving good performance.
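The joint selection of the projection dimension and the number of neighbors described in the last part of the abstract can be illustrated with a minimal sketch. It is not the thesis's procedure: the abstract does not specify the projection basis, the form of the penalty, or how the error is estimated. The sketch assumes each curve is already given as a list of basis coefficients, estimates the error on a held-out validation sample, and uses a hypothetical penalty proportional to the dimension `d`. The names `knn_predict`, `select_d_k`, and `make_toy` are illustrative only.

```python
import random
from collections import Counter


def knn_predict(train_x, train_y, x, k, d):
    """Majority vote among the k nearest training points,
    with distances computed on the first d coordinates only."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(tx[:d], x[:d])), i)
        for i, tx in enumerate(train_x)
    )
    votes = Counter(train_y[i] for _, i in dists[:k])
    return votes.most_common(1)[0][0]


def select_d_k(train, valid, dims, ks, penalty=0.0):
    """Return the pair (d, k) minimizing
    validation error rate + penalty * d.
    The penalty form is an assumption made for illustration."""
    train_x, train_y = train
    valid_x, valid_y = valid
    best = None
    for d in dims:
        for k in ks:
            errors = sum(
                knn_predict(train_x, train_y, x, k, d) != y
                for x, y in zip(valid_x, valid_y)
            )
            score = errors / len(valid_y) + penalty * d
            # Strict inequality: ties keep the earliest (d, k) tried.
            if best is None or score < best[0]:
                best = (score, d, k)
    return best[1], best[2]


def make_toy(n, rng, n_coef=8):
    """Toy coefficient vectors: only the first two coordinates
    carry class information, the others are pure noise."""
    xs, ys = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [rng.gauss(2.0 * y if j < 2 else 0.0, 1.0) for j in range(n_coef)]
        xs.append(x)
        ys.append(y)
    return xs, ys


rng = random.Random(0)
train = make_toy(100, rng)
valid = make_toy(100, rng)
d, k = select_d_k(train, valid, dims=range(1, 9), ks=[1, 3, 5, 7], penalty=0.01)
print("selected dimension:", d, "selected neighbors:", k)
```

With `penalty=0.0` the function reduces to plain minimization of the validation error. A small positive value favors lower dimensions when several pairs `(d, k)` achieve nearly the same error, which is the kind of stabilizing effect the abstract attributes to the penalized version.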
Document type :
Theses

https://tel.archives-ouvertes.fr/tel-00012008
Contributor : Christine Tuleau
Submitted on : Wednesday, March 22, 2006 - 1:31:49 PM
Last modification on : Monday, October 19, 2020 - 11:01:17 AM
Long-term archiving on : Saturday, April 3, 2010 - 9:05:34 PM

### Identifiers

• HAL Id : tel-00012008, version 1

### Citation

Christine Tuleau. SELECTION DE VARIABLES POUR LA DISCRIMINATION EN GRANDE DIMENSION ET CLASSIFICATION DE DONNEES FONCTIONNELLES. Mathématiques [math]. Université Paris Sud - Paris XI, 2005. Français. ⟨tel-00012008⟩
