
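Modèles à variables latentes pour des données issues de tiling arrays. Applications aux expériences de ChIP-chip et de transcriptome. [Latent variable models for tiling array data. Applications to ChIP-chip and transcriptome experiments.]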

Abstract : Tiling arrays make it possible to explore the genome at large scale and high resolution. The biological questions usually addressed are gene expression and the detection of transcribed regions, both investigated through transcriptomic experiments, and the regulation of gene expression, studied through ChIP-chip experiments. To analyse ChIP-chip and transcriptomic data, we propose latent variable models, in particular Hidden Markov Models, which belong to the family of unsupervised classification methods. The biological features of the tiling array signal, such as the spatial dependence between observations along the genome and the structural annotation, are integrated into the model. Moreover, the models are adapted to the biological question at hand, and a model is proposed for each type of experiment. We propose a mixture of regressions for the comparison of two samples when one sample can be considered a reference (ChIP-chip), and a two-dimensional Gaussian model with constraints on the variance parameter when the two samples play symmetrical roles (transcriptome). Finally, semi-parametric modeling is considered, allowing more flexible emission distributions. For classification purposes, we propose a false-positive control in the case of a two-cluster classification and for independent observations. We then focus on the classification of a set of observations forming a region of interest, such as a gene. The different models are illustrated on real ChIP-chip and transcriptomic datasets from a NimbleGen tiling array covering the entire genome of Arabidopsis thaliana.
Keywords : these
Document type :
Theses

Cited literature [212 references]

https://hal.inrae.fr/tel-02807071
Contributor : Migration ProdInra
Submitted on : Saturday, June 6, 2020 - 3:07:21 AM
Last modification on : Friday, August 5, 2022 - 2:38:10 PM

File

49277_20120208110735615_1.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : tel-02807071, version 1
  • PRODINRA : 49277

Citation

Caroline Bérard. Modèles à variables latentes pour des données issues de tiling arrays. Applications aux expériences de ChIP-chip et de transcriptome. Mathématiques [math]. Ecole Doctorale Agriculture Alimentation Biologie Environnement Santé, 2011. Français. ⟨tel-02807071⟩
