
Méthodes Statistiques pour l'Analyse de Données Génétiques d'Association à Grande Echelle (Statistical Methods for the Analysis of Large-Scale Genetic Association Data)

Abstract : The increasing availability of dense Single Nucleotide Polymorphism (SNP) maps, due to rapid improvements in molecular biology and genotyping technologies, has recently led geneticists towards genome-wide association studies, in the hope of obtaining encouraging results on the genetic basis of complex diseases. The analysis of such high-throughput data raises new statistical and computational problems, which constitute the main topic of this thesis.
After a brief description of the main questions raised by genome-wide association studies, we address single-marker approaches through a power study of the main association tests and of their combination. We then consider multi-marker approaches, focusing on the method we developed, which relies on the Local Score. This sum statistic identifies associations between genomic regions and the disease, rather than between individual markers and the disease. It is a simple, fast and flexible method whose efficiency we assess on simulated and real genome-wide association data. Finally, this thesis also deals with the multiple-testing problem arising from the number of independent tests performed in the analysis of high-throughput data. Our Local Score-based approach circumvents this problem by reducing the number of tests. In parallel, we present an estimation of the Local False Discovery Rate based on a simple Gaussian mixture model.
The methods described in this manuscript are implemented in three software packages available on the website of the Statistique et Génome laboratory: fueatest, LhiSA and kerfdr.
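
As an illustration of the "sum statistic over regions" idea mentioned in the abstract, the following is a minimal Python sketch of a generic local score: the maximum sum over contiguous runs of per-marker scores, computed with the standard running-sum recursion. It is not the thesis's implementation and does not reproduce LhiSA. The function names, the choice of score (-log10 of the p-value minus a threshold) and the threshold value `delta` are assumptions made for this example only.

```python
import math
from typing import List, Sequence, Tuple


def marker_scores(p_values: Sequence[float], delta: float = 1.0) -> List[float]:
    """Turn single-marker p-values into signed scores.

    Assumption for this sketch: score = -log10(p) - delta, so that markers
    with weak evidence get a negative score. `delta` is a hypothetical
    tuning parameter, not a value taken from the thesis.
    """
    return [-math.log10(p) - delta for p in p_values]


def local_score(scores: Sequence[float]) -> Tuple[float, int, int]:
    """Return (best_sum, start, end) of the highest-scoring contiguous region.

    Uses the recursion H_k = max(0, H_{k-1} + x_k); the local score is the
    maximum of H_k. Indices are 0-based and `end` is inclusive. If no region
    has a positive sum, the result is (0.0, -1, -1).
    """
    best, best_start, best_end = 0.0, -1, -1
    running, start = 0.0, 0
    for k, x in enumerate(scores):
        if running + x <= 0:
            # The running sum would drop to zero or below: restart after k.
            running = 0.0
            start = k + 1
        else:
            running += x
            if running > best:
                best, best_start, best_end = running, start, k
    return best, best_start, best_end


if __name__ == "__main__":
    # Hypothetical p-values along a chromosome, in map order.
    p = [0.5, 0.01, 0.002, 0.3, 0.04, 0.8]
    print(local_score(marker_scores(p)))
```

In this sketch a whole region receives a single statistic, which is how a region-based approach can reduce the number of tests compared with testing every marker separately. Assessing the significance of the resulting score is a separate step and is not shown here.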
Document type : Theses
Cited literature : 94 references

https://tel.archives-ouvertes.fr/tel-00169411
Contributor : Mickael Guedj
Submitted on : Monday, September 3, 2007 - 5:06:05 PM
Last modification on : Saturday, June 6, 2020 - 9:39:43 PM
Long-term archiving on : Monday, September 24, 2012 - 12:00:10 PM

Identifiers

  • HAL Id : tel-00169411, version 1
  • PRODINRA : 251431

Citation

Mickael Guedj. Méthodes Statistiques pour l'Analyse de Données Génétiques d'Association à Grande Echelle. Sciences du Vivant [q-bio]. Université d'Evry-Val d'Essonne, 2007. Français. ⟨tel-00169411⟩

Metrics

Record views : 1411
Files downloads : 519