Towards a genome-scale coevolutionary analysis

Abstract : Advances in sequencing technologies have revolutionized the life sciences. The explosion of genomic sequence data has prompted the development of a wide variety of methods, at the interface between bioinformatics, machine learning, and physics, which aim to gain a deeper understanding of biological systems from such data. Pairwise coevolutionary methods, in particular Direct Coupling Analysis (DCA), can extract a wealth of information from sequence data alone, such as structural contacts or the phenotypic effects of amino-acid substitutions in proteins. While they have mainly been applied to a limited number of individual exemplary proteins, it is now time for a broader application at the level of the whole genome. In this thesis, we build upon and extend these models to address biological questions at the genome scale. In a first project, we investigate the protein-protein interaction network by combining coevolutionary signals at multiple, interconnected scales. In a subsequent project, we discuss the possibility of including information complementary to sequences, such as typical patterns of contacts, to improve inter-protein contact prediction. Finally, through an extensive genome-wide study of E. coli strains, we show how the machinery of DCA can be used to investigate the properties of the fitness landscape at the local and global scales.

Cited literature [107 references]
Contributor : Giancarlo Croce
Submitted on : Wednesday, August 5, 2020 - 11:37:07 AM
Last modification on : Wednesday, October 14, 2020 - 4:00:43 AM


  • HAL Id : tel-02912097, version 1


Giancarlo Croce. Towards a genome-scale coevolutionary analysis. Bioinformatics [q-bio.QM]. Sorbonne Université, 2019. English. ⟨tel-02912097⟩


