Development of statistical methods for the analysis of genomic data: application to the influence of genetic polymorphism on individual skin characteristics and the expression of skin aging

Abstract: Recent technologies in genetics have generated high-dimensional databases, in particular databases of single-nucleotide polymorphisms (SNPs). These databases often contain far more variables than individuals. The goal of this dissertation was to develop statistical methods suited to analysing high-dimensional data and to selecting the most biologically relevant variables. The first part presents the state of the art in unsupervised and supervised variable selection methods for two or more blocks of variables. The second part presents two new unsupervised "sparse" methods: Group Sparse Principal Component Analysis (GSPCA) and Sparse Multiple Correspondence Analysis (Sparse MCA). Formulated as regression problems with a group LASSO penalty, these methods select blocks of quantitative and qualitative variables, respectively. The third part is devoted to interactions between SNPs and presents a method used to identify them: logic regression. Finally, the last part applies these methods to a real SNP dataset to study the possible influence of genetic polymorphism on facial skin aging in adult women. The methods developed gave relevant results that confirmed the biologist's expectations and opened new research perspectives.
Document type: Thesis

Cited literature: 52 references

https://tel.archives-ouvertes.fr/tel-00925074
Contributor: Abes Star
Submitted on : Friday, February 28, 2014 - 11:32:49 AM
Last modification on : Wednesday, March 20, 2019 - 4:52:02 PM
Long-term archiving on : Wednesday, May 28, 2014 - 11:25:46 AM

File

BERNARD_-_Anne_ThA_se.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-00925074, version 2

Citation

Anne Bernard. Développement de méthodes statistiques nécessaires à l'analyse de données génomiques : application à l'influence du polymorphisme génétique sur les caractéristiques cutanées individuelles et l'expression du vieillissement cutané. Computers and Society [cs.CY]. Conservatoire national des arts et métiers - CNAM, 2013. In French. ⟨NNT : 2013CNAM0882⟩. ⟨tel-00925074v2⟩

Metrics

Record views: 853

File downloads: 374