
Methods of random matrices for large dimensional statistical learning

Abstract : The big data challenge calls for machine learning algorithms to evolve towards more efficient learning engines suited to large dimensional data. Recently, a new direction of research has emerged that analyzes learning methods in the modern regime where the number n and the dimension p of data samples are commensurately large. Compared to the conventional regime where n ≫ p, the regime with large and comparable n and p is particularly interesting because learning performance in this regime remains sensitive to the tuning of hyperparameters, which opens a path to understanding and improving learning techniques for large dimensional datasets. The technical approach employed in this thesis draws on several advanced tools of high dimensional statistics, allowing us to conduct more elaborate analyses beyond the state of the art. The first part of this dissertation is devoted to the study of semi-supervised learning on high dimensional data. Motivated by our theoretical findings, we propose a superior alternative to the standard semi-supervised method of Laplacian regularization. Methods involving implicit optimization, such as SVMs and logistic regression, are then investigated under realistic mixture models, providing exhaustive details on the learning mechanism. Several important consequences are thus revealed, some of which contradict common belief.

Cited literature [98 references]
Contributor : ABES STAR
Submitted on : Wednesday, December 18, 2019 - 4:25:11 PM
Last modification on : Saturday, June 25, 2022 - 10:43:29 PM
Long-term archiving on : Thursday, March 19, 2020 - 9:39:01 PM


Version validated by the jury (STAR)


  • HAL Id : tel-02418282, version 1


Xiaoyi Mai. Methods of random matrices for large dimensional statistical learning. Automatic. Université Paris-Saclay, 2019. English. ⟨NNT : 2019SACLC078⟩. ⟨tel-02418282⟩


