Analysis of microbial diversity by high-throughput sequencing: methods and applications

Abstract : The characterization of microbial community structure by SSU rRNA gene profiling has advanced greatly in recent years with the introduction of NGS amplicon sequencing, which gives a better representation of sample diversity at a lower cost. This methodological progress has opened a new window onto the composition of microbial communities and sparked interest in the members of the rare biosphere. At the same time, processing such large amounts of data has become a major bottleneck for microbial ecology studies, and a multitude of analysis platforms have been developed to handle these data. As implemented, these tools have a steep learning curve for biologists who are not computationally inclined: they require extensive user intervention and consume many CPU hours because of the complexity of the datasets and of their analysis, which can be a significant barrier for researchers. Moreover, although phylogenetic affiliation has been shown to be more accurate for the taxonomic assignment of NGS reads, existing tools assign taxonomy either by similarity search or by a probabilistic approach, and restrict the use of phylogenies to the comparison of samples. Beyond taxonomic assignment, the new sequencing technologies also raise the problem of the quality of the generated sequences and its impact on richness estimation. In this work, we aimed to define a strategy for the bioinformatic analysis of high-throughput sequences in order to describe microbial diversity, taking into account both the limitations imposed by current computing resources (hardware and software) and the advantage of phylogenetic methods over other taxonomic annotation approaches. This work has led to the development of a pipeline offering a set of analyses ranging from raw sequence processing to the visualization of results, while placing the environmental sequences in an evolutionary framework.
The developed approach was optimized for managing large volumes of data and was compared, in terms of the accuracy of taxonomic assignment, with the approaches commonly used in microbial ecology. The pipeline was then used to develop a dedicated web server for high-throughput sequencing analysis. The server relies on a computing cluster and performs large-scale phylogeny-based analyses of rRNA genes with no need for specialized informatics expertise, using the phylogenies both for taxonomic assignment and for the delineation of monophyletic groups to highlight clades of interest.
Contributor : Abes Star
Submitted on : Friday, January 10, 2014 - 2:23:45 PM
Last modification on : Friday, August 23, 2019 - 3:50:04 PM
Long-term archiving on : Thursday, April 10, 2014 - 11:20:10 PM


Version validated by the jury (STAR)


  • HAL Id : tel-00926896, version 1



Najwa Taïb. Analyse de la diversité microbienne par séquençage massif : méthodes et applications. Agricultural sciences. Université Blaise Pascal - Clermont-Ferrand II, 2013. French. ⟨NNT : 2013CLF22374⟩. ⟨tel-00926896⟩


