Algorithmes de graphes pour la recherche de motifs récurrents dans les structures tertiaires d'ARN

Mahassine Djelloul 1, 2
2 AMIB - Algorithms and Models for Integrative Biology
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France
Abstract : Understanding spatial folding is key to determining the function of an RNA molecule. An RNA tertiary structure can be modeled as a graph with labels on its edges and vertices. In this model, a set of recurrent motifs is a set of similar subgraphs that share a common structure, which is not known a priori. Finding such a structure requires searching for a maximal common subgraph, a problem known to be NP-hard. The two main contributions of my thesis are (1) a new similarity measure for graphs that makes it possible to identify occurrences similar to a candidate motif, computed by an algorithm that detects a maximum common subgraph with some specific properties; and (2) a new method to automatically extract and classify (families of) similar RNA motifs. First, subgraphs that may contain motifs are extracted from the RNA graph that represents the tertiary structure. Then, similar subgraphs are grouped into distinct clusters according to the newly defined measure. There are two types of recurrent tertiary motifs: local motifs, which are inserted in secondary structure elements, and interaction motifs, which involve two or more secondary structure elements. With a small adaptation, the proposed similarity measure applies successfully to both types. The classification and extraction methods have been tested on a representative sample of RNA structures. The results have been assessed by biochemists from IMBC (Strasbourg) and are available online.
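The pipeline described in the abstract (labeled graph, similarity based on a common subgraph, clustering) can be illustrated with a small Python sketch. This is not the thesis's algorithm or measure, which the abstract does not specify. It assumes a toy similarity defined as the vertex count of a maximum common induced subgraph divided by the size of the larger graph, computed by brute force (exponential time, consistent with the NP-hardness noted above), followed by a simple threshold-based single-linkage grouping. The class `LabeledGraph`, the functions `mcs_size`, `similarity` and `cluster`, the labels `"backbone"` and `"pair"`, and the threshold are all hypothetical names chosen for illustration.

```python
from itertools import combinations, permutations


class LabeledGraph:
    """Undirected graph with labels on vertices and edges (toy model)."""

    def __init__(self, vertices, edges):
        # vertices: {name: label}; edges: {(u, v): label}
        self.vertices = dict(vertices)
        self.edges = {frozenset(e): lab for e, lab in edges.items()}

    def edge_label(self, u, v):
        # Returns None when u and v are not connected.
        return self.edges.get(frozenset((u, v)))


def _matches(g1, g2, sub1, sub2):
    """True if mapping sub1[i] -> sub2[i] preserves vertex and edge labels."""
    if any(g1.vertices[a] != g2.vertices[b] for a, b in zip(sub1, sub2)):
        return False
    pairs = list(zip(sub1, sub2))
    # Induced subgraph: edges and non-edges must both be preserved.
    return all(
        g1.edge_label(a1, a2) == g2.edge_label(b1, b2)
        for (a1, b1), (a2, b2) in combinations(pairs, 2)
    )


def mcs_size(g1, g2):
    """Vertex count of a maximum common induced subgraph (brute force)."""
    v1, v2 = list(g1.vertices), list(g2.vertices)
    for k in range(min(len(v1), len(v2)), 0, -1):
        for sub1 in combinations(v1, k):
            for sub2 in permutations(v2, k):
                if _matches(g1, g2, sub1, sub2):
                    return k
    return 0


def similarity(g1, g2):
    """Assumed measure: |MCS| / size of the larger graph (graphs non-empty)."""
    return mcs_size(g1, g2) / max(len(g1.vertices), len(g2.vertices))


def cluster(graphs, threshold):
    """Group graph names; a graph joins every cluster it is similar to."""
    clusters = []
    for name, g in graphs.items():
        linked = [
            c for c in clusters
            if any(similarity(g, graphs[m]) >= threshold for m in c)
        ]
        merged = [name]
        for c in linked:
            merged.extend(c)
            clusters.remove(c)
        clusters.append(merged)
    return clusters


# Toy candidate subgraphs: vertex labels are nucleotides, edge labels are
# illustrative interaction types.
g_a = LabeledGraph(
    {1: "G", 2: "A", 3: "C"},
    {(1, 2): "backbone", (2, 3): "backbone", (1, 3): "pair"},
)
g_b = LabeledGraph(
    {"x": "G", "y": "A", "z": "C", "w": "U"},
    {("x", "y"): "backbone", ("y", "z"): "backbone",
     ("x", "z"): "pair", ("z", "w"): "backbone"},
)
g_c = LabeledGraph({"p": "U", "q": "U"}, {("p", "q"): "backbone"})

groups = cluster({"a": g_a, "b": g_b, "c": g_c}, threshold=0.5)
print(groups)
```

In this toy input, `g_a` is contained in `g_b` with matching labels, so the two are grouped together, while `g_c` shares at most a single vertex with either and stays in its own cluster. A practical implementation would replace the brute-force search with a dedicated maximum common subgraph algorithm.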

Cited literature [73 references]

https://tel.archives-ouvertes.fr/tel-00785953
Contributor : Mireille Regnier
Submitted on : Thursday, February 7, 2013 - 1:33:37 PM
Last modification on : Wednesday, March 27, 2019 - 4:41:29 PM
Long-term archiving on : Wednesday, May 8, 2013 - 3:54:31 AM

Identifiers

  • HAL Id : tel-00785953, version 1

Citation

Mahassine Djelloul. Algorithmes de graphes pour la recherche de motifs récurrents dans les structures tertiaires d'ARN. Bio-informatique [q-bio.QM]. Université Paris Sud - Paris XI, 2009. Français. ⟨tel-00785953⟩
