Contributions to variable selection, clustering and statistical estimation in high dimension

Abstract : This PhD thesis deals with the following statistical problems: variable selection in high-dimensional linear regression, clustering in the Gaussian mixture model, some effects of adaptivity under sparsity, and simulation of Gaussian processes. Under the sparsity assumption, variable selection corresponds to recovering the "small" set of significant variables. We study non-asymptotic properties of this problem in high-dimensional linear regression. Moreover, we recover optimal necessary and sufficient conditions for variable selection in this model. We also study some effects of adaptation under sparsity. Namely, in the sparse vector model, we investigate how the estimation rates of some of the model parameters change when the noise level or its nominal law is unknown. Clustering is an unsupervised machine learning task that aims to group observations that are close to each other in some sense. We study the problem of community detection in the Gaussian mixture model with two components, and precisely characterize the sharp separation between clusters that is needed to recover the clusters exactly. We also provide a fast polynomial-time procedure achieving optimal recovery. Gaussian processes are extremely useful in practice, for instance for modeling price fluctuations. Nevertheless, their simulation is not easy in general. We propose and study a new rate-optimal series expansion to simulate a large class of Gaussian processes.
Document type : Theses

Cited literature : 171 references

https://pastel.archives-ouvertes.fr/tel-02266365
Contributor : Abes Star
Submitted on : Wednesday, August 14, 2019 - 9:03:06 AM
Last modification on : Friday, August 16, 2019 - 1:09:42 AM

File

80528_NDAOUD_2019_archivage.pd...
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-02266365, version 1

Citation

Mohamed Ndaoud. Contributions to variable selection, clustering and statistical estimation in high dimension. Statistics [math.ST]. Université Paris-Saclay, 2019. English. ⟨NNT : 2019SACLG005⟩. ⟨tel-02266365⟩

Metrics

Record views : 361
File downloads : 34