Learning with random forests

Abstract : This thesis is devoted to a nonparametric estimation method called random forests, introduced by Breiman in 2001. Extensively used in a variety of areas, random forests exhibit good empirical performance and can handle massive data sets. However, the mathematical forces driving the algorithm remain largely unknown. After reviewing the theoretical literature, we focus on the link between infinite forests (which are analyzed theoretically) and finite forests (which are used in practice), with the aim of narrowing the gap between theory and practice. In particular, we propose a way to select the number of trees such that the errors of finite and infinite forests are similar. We also study quantile forests, a type of algorithm close in spirit to Breiman's forests. In this context, we prove the benefit of tree aggregation: while each tree of a quantile forest is not consistent, the forest is consistent when a proper subsampling step is used. Next, we show the connection between forests and some particular kernel estimates, which can be made explicit in some cases. We also establish upper bounds on the rate of convergence for these kernel estimates. We then prove two theorems on the consistency of both pruned and unpruned Breiman forests, and we stress the importance of subsampling in proving the consistency of unpruned Breiman forests. Finally, we present the results of a DREAM challenge whose goal was to predict the toxicity of several compounds for several patients based on their genetic profiles.
Document type :
Theses

Cited literature [120 references]

https://tel.archives-ouvertes.fr/tel-01250221
Contributor : Abes Star
Submitted on : Friday, May 20, 2016 - 10:40:55 AM
Last modification on : Friday, March 22, 2019 - 1:30:09 AM

File

2015PA066533.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01250221, version 2

Citation

Erwan Scornet. Learning with random forests. Statistics [math.ST]. Université Pierre et Marie Curie - Paris VI, 2015. English. ⟨NNT : 2015PA066533⟩. ⟨tel-01250221v2⟩
