
Rééchantillonnage et Sélection de modèles (Resampling and Model Selection)

Sylvain Arlot 1, 2
2 SELECT - Model selection in statistical learning
LMO - Laboratoire de Mathématiques d'Orsay, Inria Saclay - Ile de France
Abstract : This thesis belongs to non-parametric statistics and statistical learning theory. Its goal is to provide a precise understanding of several resampling and model selection methods from a non-asymptotic viewpoint.

The main contribution of this thesis is the accurate calibration of model selection procedures, so that they are optimal in practice for prediction. We study V-fold cross-validation (very commonly used but poorly understood in theory, in particular regarding the choice of V) and several penalization procedures. We propose methods for accurately calibrating penalties, in both their general shape and their multiplicative constants. Resampling makes it possible to solve hard problems, in particular regression with a variable noise level (heteroscedastic regression). We prove non-asymptotic theoretical results on these methods, such as oracle inequalities and adaptivity properties. These results rely in particular on concentration inequalities.
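As an illustration of the kind of procedure discussed above, here is a minimal Python sketch of V-fold cross-validation used to choose among piecewise-constant regression models (regressograms) on data with a variable noise level. It is not taken from the thesis: the function names, the synthetic data, the family of candidate models and the default V = 5 are all assumptions made for this example.

```python
import numpy as np

def fit_regressogram(x, y, n_bins):
    """Least-squares piecewise-constant fit on a regular partition of [0, 1]."""
    idx = np.minimum((x * n_bins).astype(int), n_bins - 1)
    sums = np.bincount(idx, weights=y, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    # Bins with no training point fall back to the overall training mean.
    return np.where(counts > 0, sums / np.maximum(counts, 1), y.mean())

def predict_regressogram(bin_means, x):
    """Evaluate a fitted regressogram at the points x."""
    n_bins = len(bin_means)
    idx = np.minimum((x * n_bins).astype(int), n_bins - 1)
    return bin_means[idx]

def vfold_cv_risk(x, y, n_bins, folds):
    """Average held-out squared error over the V folds."""
    fold_risks = []
    for v, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != v])
        bin_means = fit_regressogram(x[train_idx], y[train_idx], n_bins)
        residuals = y[test_idx] - predict_regressogram(bin_means, x[test_idx])
        fold_risks.append(np.mean(residuals ** 2))
    return float(np.mean(fold_risks))

def select_n_bins(x, y, candidates, V=5, seed=0):
    """Pick the number of bins minimizing the V-fold criterion (same folds for all models)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), V)
    scores = {d: vfold_cv_risk(x, y, d, folds) for d in candidates}
    best = min(scores, key=scores.get)
    return best, scores

# Synthetic heteroscedastic data (assumed for illustration only):
# the noise standard deviation grows with x.
rng = np.random.default_rng(42)
n = 500
x = rng.uniform(0.0, 1.0, size=n)
noise_sd = 0.1 + 0.5 * x
y = np.sin(2 * np.pi * x) + noise_sd * rng.normal(size=n)

best, scores = select_n_bins(x, y, candidates=[1, 2, 4, 8, 16, 32, 64], V=5)
print("selected number of bins:", best)
```

The sketch takes V as given; how V should be chosen, and how the criterion should be corrected or replaced by a calibrated penalty, are the questions the thesis addresses and are not settled by this example.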

We also consider the problem of confidence regions and multiple testing for high-dimensional data with general and unknown correlations. Resampling methods make it possible to circumvent the curse of dimensionality and to "learn" these correlations. We propose two main procedures and prove non-asymptotic control of the level of each.
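To make the idea of "learning" correlations by resampling concrete, here is a generic Python sketch that calibrates a sup-norm confidence region for a high-dimensional mean by random sign flips. It is an assumption-laden illustration and not necessarily either of the two procedures proposed in the thesis: the function name, the sign-flip scheme, the number of resamples and the synthetic data are all hypothetical, and the sketch carries no guarantee on the level.

```python
import numpy as np

def signflip_sup_threshold(Y, alpha=0.05, n_resamples=1000, seed=0):
    """Empirical (1 - alpha) quantile of the sup-norm of sign-flipped centered means.

    Y is an (n, K) array: n observations of a K-dimensional vector.
    The coordinates may be arbitrarily correlated; the correlations are
    never estimated explicitly, they are carried by the resampled rows.
    """
    rng = np.random.default_rng(seed)
    n, _ = Y.shape
    centered = Y - Y.mean(axis=0)
    # One row of random signs per resample, applied to whole observations.
    signs = rng.choice([-1.0, 1.0], size=(n_resamples, n))
    resampled_means = signs @ centered / n           # shape (n_resamples, K)
    sup_stats = np.abs(resampled_means).max(axis=1)  # sup-norm per resample
    return float(np.quantile(sup_stats, 1 - alpha))

# Synthetic correlated data (assumed for illustration): a shared factor
# induces strong positive correlation between all K coordinates.
rng = np.random.default_rng(1)
n, K = 50, 200
shared = rng.normal(size=(n, 1))
Y = 0.8 * shared + 0.6 * rng.normal(size=(n, K))

t95 = signflip_sup_threshold(Y, alpha=0.05)
Y_bar = Y.mean(axis=0)
lower, upper = Y_bar - t95, Y_bar + t95  # simultaneous band over the K coordinates
print("sup-norm threshold:", t95)
```

Because whole observations are flipped at once, strongly correlated coordinates yield a smaller threshold than a bound that treats the K coordinates as unrelated, which is the intuition behind using resampling here.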
Contributor : Sylvain Arlot
Submitted on : Monday, December 17, 2007 - 11:10:00 PM
Last modification on : Friday, November 27, 2020 - 5:50:03 PM
Long-term archiving on : Monday, April 12, 2010 - 8:15:52 AM


  • HAL Id : tel-00198803, version 1



Sylvain Arlot. Rééchantillonnage et Sélection de modèles. Mathématiques [math]. Université Paris Sud - Paris XI, 2007. Français. ⟨tel-00198803⟩


