Contributions to the statistical analysis of microarray data.

Abstract : This thesis deals with statistical questions raised by the analysis of high-dimensional genomic data for cancer research. In the first part, we study asymptotic properties of multiple testing procedures that aim at controlling the False Discovery Rate (FDR), that is, the expected False Discovery Proportion (FDP) among rejected hypotheses. We develop a versatile formalism to calculate the asymptotic distribution of the FDP and the associated regularity conditions for a wide range of multiple testing procedures, and we compare their asymptotic power. We then study, in terms of FDR control, the connections between intrinsic bounds for three multiple testing problems: detection, estimation and selection. In particular, we connect convergence rates in the estimation problem to the regularity of the p-value distribution near 1. In the second part, we develop statistical methods to study DNA microarrays for cancer research. We propose a microarray normalization method that removes spatial biases while preserving the true biological signal; it combines robust regression with a mixture model under spatial constraints. We then develop a method to infer gene regulation from gene expression data, based on learning theory and multiple testing theory. Finally, we build a genomic score to predict, for a patient treated for a breast tumor, whether a second cancer is a true recurrence of the first.

Contributor : Pierre Neuvial
Submitted on : Wednesday, November 18, 2009 - 8:07:44 AM
Last modification on : Wednesday, May 15, 2019 - 3:39:21 AM
Long-term archiving on : Tuesday, October 16, 2012 - 2:20:57 PM


  • HAL Id : tel-00433045, version 1


Pierre Neuvial. Contributions to the statistical analysis of microarray data. Life Sciences [q-bio]. Université Paris-Diderot - Paris VII, 2009. English. ⟨tel-00433045⟩


