
A random matrix framework for large dimensional machine learning and neural networks

Abstract: Large dimensional data and learning systems are ubiquitous in modern machine learning. As opposed to small dimensional learning, large dimensional machine learning algorithms are prone to various counterintuitive phenomena and behave strikingly differently from the low dimensional intuitions upon which they are built. Nonetheless, by assuming that the data dimension and the number of samples are both large and comparable, random matrix theory (RMT) provides a systematic approach to assess the (statistical) behavior of these large learning systems when applied to large dimensional data. The major objective of this thesis is to propose a full-fledged RMT-based framework for various machine learning systems: to assess their performance, to properly understand them, and to carefully refine them, so as to better handle the large dimensional problems that are increasingly common in artificial intelligence applications.

Precisely, we exploit the close connection between kernel matrices, random feature maps, and single-hidden-layer random neural networks. Under a simple Gaussian mixture model for the input data, we provide a precise characterization of the performance of these large dimensional learning systems as a function of the data statistics, the dimensionality, and, most importantly, the hyperparameters of the problem (e.g., the choice of the kernel function or activation function). Turning to more involved learning algorithms, we extend the present RMT analysis framework to assess large learning systems that are implicitly defined by convex optimization problems (e.g., logistic regression), when optimal points are assumed reachable. To find these optimal points, optimization methods such as gradient descent are routinely used. Aiming for a better theoretical grasp of the inner mechanism of optimization methods and their impact on the resulting learning model, we further evaluate the gradient descent dynamics in training convex and non-convex objectives.

These preliminary studies provide a first quantitative understanding of the aforementioned learning algorithms when large dimensional data are processed, which in turn helps propose better design criteria for large learning systems, resulting in remarkable performance gains when applied to real-world datasets. Deeply rooted in the idea of mining large dimensional data for repeated patterns at a global rather than a local level, the proposed RMT analysis framework allows for a renewed understanding of, and the possibility to control and improve, a much larger range of machine learning approaches, thereby opening the door to a renewed machine learning framework for artificial intelligence.
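To make the regime described above concrete, the following minimal Python sketch (not taken from the thesis; the dimensions, the two-class Gaussian mixture, and the ReLU activation are all illustrative assumptions) builds the kind of object the analysis targets: the Gram matrix of a single-hidden-layer random feature map applied to Gaussian mixture data, whose eigenvalue spectrum concentrates in the large dimensional limit.

    import numpy as np

    # Minimal sketch, assuming a two-class Gaussian mixture and a ReLU random
    # feature map; the dimensions below are illustrative, not those of the thesis.
    rng = np.random.default_rng(0)
    p, n, N = 512, 1024, 1024    # data dimension, sample size, feature count: all comparable

    mu = np.zeros(p)
    mu[0] = 2.0                  # class means at +mu and -mu

    y = 2 * rng.integers(0, 2, size=n) - 1                 # labels in {-1, +1}
    X = (rng.standard_normal((p, n)) + np.outer(mu, y)) / np.sqrt(p)

    W = rng.standard_normal((N, p))                        # random first-layer weights
    S = np.maximum(W @ X, 0.0)                             # ReLU(W X): N x n random features

    G = S.T @ S / N                                        # n x n Gram matrix of the features
    eigs = np.linalg.eigvalsh(G)
    print(f"spectral edges: [{eigs.min():.3f}, {eigs.max():.3f}]")
    # In the joint limit n, p, N -> infinity with fixed ratios, RMT predicts this
    # spectrum converges to a deterministic limit set by the mixture statistics
    # and the activation; changing the seed barely moves the printed edges.

Re-running the sketch with a different seed leaves the printed spectral edges nearly unchanged: this concentration is what makes a deterministic RMT characterization of performance possible.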
Complete list of metadata

Cited literature: 145 references

https://tel.archives-ouvertes.fr/tel-02397287
Contributor: ABES STAR
Submitted on: Friday, December 6, 2019 - 2:46:08 PM
Last modification on: Friday, April 10, 2020 - 2:11:17 AM
Long-term archiving on: Saturday, March 7, 2020 - 3:12:49 PM

File

83028_LIAO_2019_archivage.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id: tel-02397287, version 1

Citation

Zhenyu Liao. A random matrix framework for large dimensional machine learning and neural networks. Other. Université Paris-Saclay, 2019. English. ⟨NNT : 2019SACLC068⟩. ⟨tel-02397287⟩

Metrics

Record views: 376
File downloads: 481