Methods of random matrices for large dimensional statistical learning

Abstract : The big data challenge calls for machine learning algorithms to evolve into more efficient learning engines suited to large dimensional data. Recently, a new direction of research has emerged that analyzes learning methods in the modern regime where the number n and the dimension p of data samples are commensurately large. Compared to the conventional regime where n >> p, the regime with large and comparable n and p is particularly interesting because learning performance in this regime remains sensitive to the tuning of hyperparameters, which opens a path towards understanding and improving learning techniques for large dimensional datasets. The technical approach employed in this thesis draws on several advanced tools of high dimensional statistics, allowing us to conduct more elaborate analyses beyond the state of the art. The first part of this dissertation is devoted to the study of semi-supervised learning on high dimensional data. Motivated by our theoretical findings, we propose a superior alternative to the standard semi-supervised method of Laplacian regularization. Methods involving implicit optimization, such as SVMs and logistic regression, are then investigated under realistic mixture models, providing exhaustive details on the learning mechanism. Several important consequences are thus revealed, some of which contradict common belief.

Cited literature : 98 references

https://tel.archives-ouvertes.fr/tel-02418282
Contributor : Abes Star
Submitted on : Wednesday, December 18, 2019 - 4:25:11 PM
Last modification on : Wednesday, April 8, 2020 - 3:53:44 PM
Document(s) archived on : Thursday, March 19, 2020 - 9:39:01 PM

File

80983_MAI_2019_diffusion.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-02418282, version 1

Citation

Xiaoyi Mai. Methods of random matrices for large dimensional statistical learning. Automatic. Université Paris-Saclay, 2019. English. ⟨NNT : 2019SACLC078⟩. ⟨tel-02418282⟩

Metrics

Record views : 95
File downloads : 243