
Forêts aléatoires et interprétabilité des algorithmes d’apprentissage (Random forests and interpretability of learning algorithms)

Abstract : This thesis deals with the interpretability of learning algorithms in an industrial context. Manufacturing production and the design of industrial systems are two examples where the interpretability of learning methods makes it possible to grasp how the inputs and outputs of a system are connected, and therefore to improve the system’s efficiency. Although there is no consensus on a precise definition of interpretability, it is possible to identify several requirements: “simplicity, stability, and accuracy”, which are rarely all satisfied by existing interpretable methods. The structure and stability of random forests make them good candidates to improve the performance of interpretable algorithms. The first part of this thesis is dedicated to post-hoc methods, in particular variable importance measures for random forests. The first convergence result for Breiman’s MDA is established and shows, from a sensitivity analysis perspective, that this measure is strongly biased. The Sobol-MDA algorithm is introduced to fix the flaws of the MDA, replacing permutations with projections. An extension to Shapley effects, an efficient importance measure when input variables are dependent, is then proposed with the SHAFF algorithm. The second part of this thesis focuses on rule learning models, which are simple and highly predictive algorithms but are very often unstable with respect to small data perturbations. The SIRUS algorithm is designed to extract a compact rule ensemble from a random forest, and considerably improves stability over state-of-the-art competitors while preserving simplicity and accuracy.

https://tel.archives-ouvertes.fr/tel-03478241
Contributor : ABES STAR
Submitted on : Tuesday, May 31, 2022 - 10:31:44 AM
Last modification on : Friday, August 5, 2022 - 3:00:08 PM

File

BENARD_Clement_these_2021.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-03478241, version 2

Citation

Clément Bénard. Forêts aléatoires et interprétabilité des algorithmes d’apprentissage. Statistiques [math.ST]. Sorbonne Université, 2021. Français. ⟨NNT : 2021SORUS319⟩. ⟨tel-03478241v2⟩

Metrics

Record views : 223
File downloads : 200