Classification of RNA Pseudoknots and Comparison of Structure Prediction Methods

Abstract : A large body of research underlines the importance of RNA molecules, which play vital roles in many molecular processes. It is commonly believed that the structures of RNA molecules hold the key to discovering their functions. In investigating RNA structures, researchers depend increasingly on bioinformatics methods. Many in silico methods for predicting RNA secondary structures have emerged as a result, including some capable of predicting pseudoknots, a particular type of RNA secondary structure.

The purpose of this dissertation is to compare the state-of-the-art methods for predicting pseudoknots and to offer colleagues some insight into how to choose a practical method for a given single sequence. Considerable effort has gone into the prediction of RNA secondary structures, including pseudoknots, over the last decades, producing many programs in this field. This raises some challenging questions. How well does each method perform, especially on a particular class of RNA sequences? What are their advantages and disadvantages? What can we learn from contemporary methods when developing new ones? This dissertation sets out to investigate the answers.

This dissertation carries out numerous comparisons of how well the available methods predict RNA pseudoknots. The first main part focuses on the prediction of frameshifting signals, principally by two methods. The second main part focuses on the prediction of pseudoknots that participate in much more general molecular activities.

In detail, the second part covers 414 pseudoknots, drawn from both Pseudobase and the Protein Data Bank, and 15 methods, comprising 3 exact methods and 12 heuristic ones. Three main categories of complexity measurements are introduced, each of which further divides the 414 pseudoknots into a series of subclasses.
The comparisons evaluate the predictions of each method on the entire set of 414 pseudoknots and on subsets classified by the complexity measurements and by the length, RNA type and organism of the pseudoknots.

The results show that naturally occurring pseudoknots have relatively low complexity under all measurements. The performance of contemporary methods varies from subclass to subclass but decreases consistently as the complexity of the pseudoknots increases. More generally, the heuristic methods outperform the exact ones overall. The assessment results are sensitive, depending strongly on the quality of the reference structures and the evaluation system. Finally, this part of the work is provided as an online benchmark for the bioinformatics community.
Contributor : Abes Star
Submitted on : Monday, April 25, 2016 - 4:04:07 PM
Last modification on : Tuesday, April 24, 2018 - 1:38:20 PM
Long-term archiving on : Monday, November 14, 2016 - 1:21:15 PM


Version validated by the jury (STAR)


  • HAL Id : tel-01297053, version 1


Cong Zeng. Classification of RNA Pseudoknots and Comparison of Structure Prediction Methods. Bioinformatics [q-bio.QM]. Université Paris Sud - Paris XI, 2015. English. ⟨NNT : 2015PA112127⟩. ⟨tel-01297053⟩


