
Comparison of homologous protein sequences using direct coupling information by pairwise Potts model alignments

Hugo Talibart
Abstract : To assign structural and functional annotations to the ever-increasing number of sequenced proteins, the main approach relies on sequence-based homology search methods, which look for significant alignments of query sequences to annotated proteins or protein families. While powerful, existing approaches do not take coevolution between residues into account. Taking advantage of recent advances in the field of contact prediction, in this thesis we propose to represent proteins by Potts models, which model direct couplings between positions in addition to positional composition, and to compare proteins by aligning these models. This novel application of Potts models places additional requirements on their construction, and we identify several key points for building more comparable Potts models, moving towards an ideal of canonicity. Because of non-local dependencies, the problem of aligning Potts models is NP-hard. We introduce a method based on an Integer Linear Programming formulation of the problem, which can be solved to optimality in tractable time. Our first results suggest that taking pairwise couplings into account can improve the alignment of remote homologs and could thus improve remote homology detection.
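For readers unfamiliar with the term, the Potts model referred to in the abstract is usually written in the contact-prediction literature as below. This is the standard textbook form, given here only as orientation; the symbols (L for the number of positions, v_i for positional fields, w_ij for pairwise couplings, Z for the normalization constant) are assumptions of this note, and the thesis may use different notation or sign conventions.

```latex
% Standard Potts model over a sequence x = (x_1, ..., x_L):
% v_i captures positional composition, w_{ij} captures direct couplings.
P(x_1, \dots, x_L) \;=\; \frac{1}{Z}
  \exp\!\Bigg( \sum_{i=1}^{L} v_i(x_i)
  \;+\; \sum_{1 \le i < j \le L} w_{ij}(x_i, x_j) \Bigg)
```

The single-position terms play the role that position-specific scores play in profile-based methods, while the pairwise terms are what make the alignment problem non-local.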
Document type : Theses

https://tel.archives-ouvertes.fr/tel-03376771
Contributor : Abes Star
Submitted on : Wednesday, October 13, 2021 - 4:50:31 PM
Last modification on : Tuesday, November 16, 2021 - 1:48:23 PM

File

TALIBART_Hugo.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-03376771, version 1

Citation

Hugo Talibart. Comparison of homologous protein sequences using direct coupling information by pairwise Potts model alignments. Bioinformatics [q-bio.QM]. Université Rennes 1, 2021. English. ⟨NNT : 2021REN1S031⟩. ⟨tel-03376771⟩
