
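Identification of genomic determinants involved in the binding specificity of transcription factors (original title: Identification de déterminants génomiques impliqués dans la spécificité de fixation des facteurs de transcription).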

Abstract : In this thesis, we study the genomic determinants that can explain differences in the binding of a given transcription factor (TF) between two cell types. Transcription factors recognise and bind to particular subsequences; the collection of potential subsequences is modelled as a binding motif. However, the binding motif of a TF does not fully explain its binding: a TF is not necessarily bound wherever it recognises its motif, and it does not bind to the same loci in every cell type. The aim of this work is therefore to study other sources of information in order to better understand TF binding in different cell types. The problem is framed as supervised classification, where the examples are genomic sequences and the two classes correspond to the cell types in which the sequence is bound by the TF of interest. Sequences are described by three kinds of genomic features, each extracted from the raw sequence by a dedicated method: the nucleotide specificity of the binding site, the nucleotide content around the binding site, and the presence and position of potential binding sites of other cooperative transcription factors. All these features are used in a logistic regression model trained with penalized likelihood on different classification problems, each associating one TF with two different tissues. In each experiment, the model is used to identify the regulatory elements that are most important for cell-type differences. Our experiments show that it is possible to distinguish cell-type-specific binding sites on the basis of the sequence alone. Moreover, a global analysis of the results shows that the relative importance of the three kinds of information strongly depends on the TF and the cell types.
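The classification setup described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. The toy sequences, the `TGAC` scoring matrix, the function names and the choice of an L1 penalty are all assumptions made for illustration, since the abstract only says "penalized likelihood". Only two of the three feature kinds are sketched (a motif score and the nucleotide content), and the model is fitted by plain proximal gradient descent.

```python
import math

def gc_content(seq):
    """Fraction of G/C bases: a simple stand-in for 'nucleotide content'."""
    return sum(base in "GC" for base in seq) / len(seq)

def best_motif_score(seq, matrix):
    """Best score of a position-specific scoring matrix over all windows."""
    width = len(matrix)
    return max(
        sum(matrix[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(value, amount):
    """Proximal operator of the L1 penalty (shrinks a weight towards zero)."""
    return math.copysign(max(abs(value) - amount, 0.0), value)

def fit_l1_logistic(X, y, penalty=0.01, step=0.1, n_iter=2000):
    """Proximal gradient descent on mean logistic loss + penalty * ||w||_1.

    The intercept b is left unpenalized.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        w = [soft_threshold(wj - step * gj, step * penalty)
             for wj, gj in zip(w, grad_w)]
    return w, b

def predict(x, w, b):
    """Predicted class: 1 if the estimated probability exceeds 0.5."""
    return int(sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) > 0.5)

# Hypothetical toy scoring matrix for the motif TGAC (+1 match, -1 mismatch).
toy_matrix = [{base: (1.0 if base == m else -1.0) for base in "ACGT"}
              for m in "TGAC"]

# Hypothetical bound sequences: label 0 = cell type A, label 1 = cell type B.
sequences = ["ATTATGACATTA", "TTATGACTATAA", "AATGACATATTT",
             "GCGCTGACGCGG", "CCGTGACGGCGC", "GGCTGACCGCCG"]
labels = [0, 0, 0, 1, 1, 1]

features = [[best_motif_score(s, toy_matrix), gc_content(s)] for s in sequences]
weights, bias = fit_l1_logistic(features, labels)
predictions = [predict(x, weights, bias) for x in features]
print("weights:", weights, "bias:", bias, "predictions:", predictions)
```

In this toy data every sequence contains the motif, so the motif score is identical across examples and only the GC content separates the two classes. This mirrors the abstract's point that the motif alone does not explain cell-type differences. With an L1 penalty, the fitted weights also give a rough indication of which features matter.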
Contributor : ABES STAR
Submitted on : Friday, April 8, 2022 - 6:21:56 PM
Last modification on : Friday, August 5, 2022 - 10:51:51 AM
Long-term archiving on : Saturday, July 9, 2022 - 7:19:15 PM


Version validated by the jury (STAR)


  • HAL Id : tel-03635893, version 1


Raphaël Romero. Identification de déterminants génomiques impliqués dans la spécificité de fixation des facteurs de transcription. Génomique, Transcriptomique et Protéomique [q-bio.GN]. Université Montpellier, 2021. Français. ⟨NNT : 2021MONTS119⟩. ⟨tel-03635893⟩


