
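Modèles à variables latentes pour des données issues de tiling arrays. Applications aux expériences de ChIP-chip et de transcriptome. (Latent variable models for tiling array data. Applications to ChIP-chip and transcriptome experiments.)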

Abstract : Tiling arrays make possible a large-scale, high-resolution exploration of the genome. The biological questions usually addressed are gene expression and the detection of transcribed regions, both investigated through transcriptomic experiments, and the regulation of gene expression, investigated through ChIP-chip experiments. To analyse ChIP-chip and transcriptomic data, we propose latent variable models, especially Hidden Markov Models, which belong to the family of unsupervised classification methods. The biological features of the tiling array signal, such as the spatial dependence between observations along the genome and the structural annotation, are integrated into the model. Moreover, the models are adapted to the biological question at hand, and a model is proposed for each type of experiment. We propose a mixture of regressions for the comparison of two samples when one sample can be considered a reference (ChIP-chip), and a two-dimensional Gaussian model with constraints on the variance parameter when the two samples play symmetrical roles (transcriptome). Finally, semi-parametric modeling is considered, allowing more flexible emission distributions. For the classification step, we propose a false-positive control in the case of a two-cluster classification and independent observations. We then focus on the classification of a set of observations forming a region of interest, such as a gene. The different models are illustrated on real ChIP-chip and transcriptomic datasets from a NimbleGen tiling array covering the entire genome of Arabidopsis thaliana.

Cited literature [212 references]
Contributor : Migration ProdInra
Submitted on : Saturday, June 6, 2020 - 3:07:21 AM
Last modification on : Friday, August 5, 2022 - 2:38:10 PM


Publisher files allowed on an open archive


  • HAL Id : tel-02807071, version 1
  • PRODINRA : 49277



Caroline Bérard. Modèles à variables latentes pour des données issues de tiling arrays. Applications aux expériences de ChIP-chip et de transcriptome. Mathématiques [math]. Ecole Doctorale Agriculture Alimentation Biologie Environnement Santé, 2011. French. ⟨tel-02807071⟩


