
Forêts aléatoires : aspects théoriques, sélection de variables et applications (Random forests: theoretical aspects, variable selection and applications)

Abstract : This thesis deals with statistical learning and is dedicated to the random forests method, proposed by Breiman in 2001. Random forests are a non-parametric statistical method that performs very well in many applications, for regression as well as for supervised classification problems. They can also handle very high-dimensional data, where the number of variables far exceeds the number of observations. In the first part, we develop a variable selection procedure based on the variable importance index computed by random forests. This importance index makes it possible to distinguish relevant variables from useless ones. The proposed procedure automatically selects a set of variables for interpretation or prediction purposes. The second part shows the ability of the variable selection procedure to deal with very different problems. The first application is a very high-dimensional classification problem on neuroimaging data, while the second concerns genomic data, which constitute a lower-dimensional regression problem. A final, theoretical part establishes risk bounds for a simplified version of random forests. In the context of regression problems with a one-dimensional predictor space, we prove that both tree and forest estimators achieve the minimax rate of convergence. In addition, we prove that forests improve accuracy by reducing the estimator variance by a factor of three fourths.
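To give a rough idea of what importance-based variable selection can look like, here is a minimal pure-Python sketch. It is not the procedure developed in the thesis, whose details the abstract does not give. The assumptions are all illustrative: a toy forest of shallow regression trees that split at the median of a randomly chosen variable, a permutation importance computed on out-of-bag observations, and a hypothetical selection rule that keeps variables whose importance exceeds 10% of the largest one. The function names (`grow`, `predict`, `forest_importance`) and the simulated data are invented for this example.

```python
import random

def sse(y, rows):
    """Sum of squared deviations of y over the given row indices."""
    if not rows:
        return 0.0
    m = sum(y[i] for i in rows) / len(rows)
    return sum((y[i] - m) ** 2 for i in rows)

def grow(X, y, rows, depth, rng, mtry):
    """Grow a small regression tree. Each node tries `mtry` random variables,
    each split at the median of that variable (a simplifying assumption)."""
    mean = sum(y[i] for i in rows) / len(rows)
    if depth == 0 or len(rows) < 6:
        return mean  # leaf: predict the mean response
    best = None
    for j in rng.sample(range(len(X[0])), mtry):
        t = sorted(X[i][j] for i in rows)[len(rows) // 2]
        left = [i for i in rows if X[i][j] < t]
        right = [i for i in rows if X[i][j] >= t]
        if not left:
            continue  # degenerate split, skip this variable
        cost = sse(y, left) + sse(y, right)
        if best is None or cost < best[0]:
            best = (cost, j, t, left, right)
    if best is None:
        return mean
    _, j, t, left, right = best
    return (j, t, grow(X, y, left, depth - 1, rng, mtry),
            grow(X, y, right, depth - 1, rng, mtry))

def predict(tree, x):
    """Route x down the tree until a leaf (a float) is reached."""
    while isinstance(tree, tuple):
        j, t, low, high = tree
        tree = low if x[j] < t else high
    return tree

def forest_importance(X, y, n_trees=50, depth=4, seed=0):
    """Average increase in out-of-bag squared error when one variable is permuted."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    mtry = max(1, p // 3)
    importance = [0.0] * p
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(boot)
        oob = [i for i in range(n) if i not in in_bag]
        if not oob:
            continue
        tree = grow(X, y, boot, depth, rng, mtry)
        base = sum((y[i] - predict(tree, X[i])) ** 2 for i in oob) / len(oob)
        for j in range(p):
            perm = [X[i][j] for i in oob]
            rng.shuffle(perm)  # break the link between variable j and y
            err = 0.0
            for i, v in zip(oob, perm):
                x = list(X[i])
                x[j] = v
                err += (y[i] - predict(tree, x)) ** 2
            importance[j] += (err / len(oob) - base) / n_trees
    return importance

# Simulated data (assumption): only variables 0 and 1 carry signal.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(6)] for _ in range(200)]
y = [5 * x[0] + 3 * x[1] + data_rng.gauss(0, 0.5) for x in X]

importance = forest_importance(X, y)
threshold = 0.1 * max(importance)  # hypothetical rule, not the thesis's
selected = [j for j, v in enumerate(importance) if v > threshold]
print("importance:", [round(v, 3) for v in importance])
print("selected variables:", selected)
```

On data like this, the pure-noise variables should receive importances close to zero, so a simple threshold is enough to separate them from the informative ones. Real problems generally need a more careful, data-driven threshold.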
Contributor : Robin Genuer
Submitted on : Saturday, January 1, 2011 - 3:13:16 PM
Last modification on : Wednesday, October 14, 2020 - 4:00:41 AM
Long-term archiving on : Monday, November 5, 2012 - 3:05:22 PM


  • HAL Id : tel-00550989, version 1



Robin Genuer. Forêts aléatoires : aspects théoriques, sélection de variables et applications. Mathematics [math]. Université Paris Sud - Paris XI, 2010. French. ⟨tel-00550989⟩


