Modèles combinatoires des structures d'ARN avec ou sans pseudonoeuds, application à la comparaison de structures.

Cédric Saule 1, 2
2 AMIB - Algorithms and Models for Integrative Biology
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France
Abstract : This thesis proposes combinatorial models of RNA secondary structures with or without pseudoknots. We design several models of these structures and study them from two angles. On the one hand, we define random generation models, which yield a measure that improves the recognition of biological structures. On the other hand, thanks to appropriate encodings and bijections with languages generated by context-free grammars, we count the structures that make up the search space of exact algorithms for predicting secondary structures with pseudoknots.
The first part deals with random models of RNA structures without pseudoknots. We show that these random structures are a relevant source of noise for testing whether structure comparison software assigns better scores to comparisons between structures of the same family than to alignments between real and random structures. We then compare the sensitivity and specificity of RNAdistance, a structure comparison program, depending on whether the "raw" score or the Z-value is used. We compute several Z-values under different models of random structures. We show that the Z-value computed from a Markov model improves the detection of large RNAs, while the Z-value computed from a model based on weighted grammars improves the detection of small RNAs.
The second part considers algorithms for predicting secondary structures with pseudoknots. First, we complete the classification of Condon et al. by describing the structures through their consistency graph, and we characterize the planar restriction of the Rivas and Eddy class. We then investigate the tradeoff between the complexity of existing algorithms and the size of their prediction space. We count the structures by encoding them as words of algebraic (context-free) languages, and we derive asymptotic formulas for these counts. We also exhibit a bijection between the Lyngso and Pedersen class and planar maps, and a bijection between the class of undifferentiated pseudoknots that we introduce and ternary trees. We show that the observed differences in the complexity of prediction algorithms are not always justified by the size of their prediction space. From these grammars, we design efficient algorithms for the random generation of RNA structures with pseudoknots, either uniform or controlled non-uniform.
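As an illustration of the Z-value mentioned in the abstract, the following is a minimal sketch, assuming the usual definition: a raw comparison score is standardised against scores obtained on randomly generated structures. The function name `z_value` and the sample numbers are hypothetical; the thesis's actual random models (Markov model, weighted grammars) and its scoring pipeline are not reproduced here.

```python
from statistics import mean, stdev


def z_value(score, background_scores):
    """Standardise `score` against scores obtained on random structures.

    `background_scores` is assumed to hold comparison scores computed
    between the query and structures drawn from some random model.
    """
    if len(background_scores) < 2:
        raise ValueError("need at least two background scores")
    mu = mean(background_scores)
    sigma = stdev(background_scores)  # sample standard deviation
    if sigma == 0:
        raise ValueError("background scores have zero variance")
    return (score - mu) / sigma


# Hypothetical numbers, for illustration only.
print(z_value(5, [1, 2, 3]))
```

Whether a large or a small Z-value indicates a significant match depends on whether the underlying score is a similarity or a distance; that convention is left to the caller in this sketch.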
Cited literature [136 references]

https://tel.archives-ouvertes.fr/tel-00788467
Contributor : Mireille Regnier
Submitted on : Friday, February 15, 2013 - 12:22:54 PM
Last modification on : Wednesday, March 27, 2019 - 4:41:29 PM
Long-term archiving on : Thursday, May 16, 2013 - 3:57:24 AM

Identifiers

  • HAL Id : tel-00788467, version 1

Citation

Cédric Saule. Modèles combinatoires des structures d'ARN avec ou sans pseudonoeuds, application à la comparaison de structures. Bioinformatics [q-bio.QM]. Université Paris Sud - Paris XI, 2011. French. ⟨tel-00788467⟩
