Reconstitution of microbial pan-genomes by shotgun metagenomic sequencing: Application to the study of the human gut microbiota

Abstract: The advent of shotgun metagenomic sequencing has revolutionized microbiology by allowing culture-independent characterization of complex microbial communities such as the human gut microbiota. Recently developed bioinformatics tools achieve strain-level resolution by taking a census of accessory genes or by capturing single-nucleotide variants (SNPs). Yet these tools are limited by the range of available reference genomes, which is far from covering all microbial variability: many species have not been sequenced or are represented by only a few genomes.

Building non-redundant gene catalogs and then binning co-abundant genes reveals part of this microbial dark matter by reconstituting the gene repertoire of potentially unknown species. While existing methods accurately identify core genes present in all strains of a species, they miss many accessory genes or split them into small gene groups that remain unassociated with core genomes. Capturing these accessory genes is nevertheless essential in clinical research and epidemiology because they provide strain-specific functions such as pathogenicity or antibiotic resistance.

In this thesis, we developed MSPminer, a computationally efficient software tool that reconstitutes Metagenomic Species Pan-genomes (MSPs) by binning co-abundant genes across metagenomic samples. MSPminer relies on a new robust measure of proportionality coupled with an empirical classifier to group and distinguish not only species core genes but also accessory genes.

With MSPminer, we structured a catalog of 9.9 million genes from the human gut microbiota into 1,661 MSPs. The homogeneity of the taxonomic annotation and of the nucleotide composition, as well as the presence of essential genes, indicates that the MSPs do not correspond to chimeras but to biologically consistent objects grouping genes from the same species. Among these MSPs, 1,301 (78%) could not be annotated at species level, showing that many microorganisms colonizing the human intestinal tract are still unknown despite substantial improvements in microbial culture techniques. Remarkably, MSPs capture more genes than clusters generated by existing tools while ensuring high specificity.

This set of MSPs can be readily used for taxonomic profiling and biomarker discovery in human gut metagenomic samples. We use the MSPs to compare the impact of two main types of bariatric surgery, laparoscopic sleeve gastrectomy (LSG) and laparoscopic Roux-en-Y gastric bypass (LRYGB). Finally, the MSPs open the way to strain-level analyses: in another cohort, we identified subspecies associated with the host's geographical origin by studying presence/absence patterns of the accessory genes grouped in the MSPs.
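
The idea of binning co-abundant genes can be illustrated with a toy proportionality check. The sketch below is not MSPminer's actual measure or classifier, which the abstract does not detail. It assumes a simple stand-in: the median ratio between two abundance profiles over the samples where both genes are detected, plus the fraction of those samples lying within a tolerance of that ratio. The function names, the 25% tolerance, the 0.8 threshold and the sample values are all hypothetical.

```python
from statistics import median


def proportionality(x, y, tol=0.25, min_samples=3):
    """Toy proportionality check between two gene abundance profiles.

    Returns (ratio, fraction), where ratio is the median of y/x over the
    samples in which both genes are detected, and fraction is the share of
    those samples with y within `tol` (relative) of ratio * x.
    Returns (None, 0.0) when fewer than `min_samples` samples are shared.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a > 0 and b > 0]
    if len(pairs) < min_samples:
        return None, 0.0
    ratio = median(b / a for a, b in pairs)
    consistent = sum(
        1 for a, b in pairs if abs(b - ratio * a) <= tol * ratio * a
    )
    return ratio, consistent / len(pairs)


def is_accessory_of(core, gene, min_consistent=0.8):
    """Hypothetical rule: the gene is proportional to the core profile where
    detected, but missing from at least one sample in which the core is seen."""
    ratio, fraction = proportionality(core, gene)
    if ratio is None or fraction < min_consistent:
        return False
    return any(c > 0 and g == 0 for c, g in zip(core, gene))


# Made-up abundances across six samples.
core = [10, 20, 0, 40, 30, 50]
core_like = [20, 41, 0, 79, 60, 100]   # proportional in every sample
accessory = [5, 0, 0, 20, 15, 0]       # proportional, but absent in some
unrelated = [3, 50, 7, 1, 90, 2]       # no stable ratio

print(proportionality(core, core_like))
print(is_accessory_of(core, accessory))
print(proportionality(core, unrelated))
```

Restricting the ratio to co-detected samples is what lets an accessory gene, present in only some strains, still be attached to its species: its zeros are treated as absence rather than as a break in proportionality.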

https://tel.archives-ouvertes.fr/tel-02274206
Contributor: Abes Star
Submitted on: Thursday, August 29, 2019 - 3:43:06 PM
Last modification on: Saturday, August 31, 2019 - 1:12:38 AM

File

71142_PLAZA_ONATE_2018_archiva...
Version validated by the jury (STAR)

Identifiers

  • HAL Id: tel-02274206, version 1

Citation

Florian Plaza Onate. Reconstitution de pan-génomes microbiens par séquençage métagénomique aléatoire : Application à l’étude du microbiote intestinal humain. Bioinformatics [q-bio.QM]. Université Paris-Saclay, 2018. In French. ⟨NNT: 2018SACLV068⟩. ⟨tel-02274206⟩
