Bioinformatic analysis of the genomes of epidemic Pseudomonas aeruginosa

Panisa Treepong

Abstract

Pseudomonas aeruginosa is a major nosocomial pathogen, and ST235 is the most prevalent of the so-called ‘international’ or ‘high-risk’ clones. This clone is associated with poor clinical outcomes, in part because of multi- and high-level antibiotic resistance. Despite its clinical importance, the molecular basis for the success of the ST235 clone is poorly understood. This thesis therefore aimed to understand the origin of ST235 and the molecular basis for its success, and to design a bioinformatics tool for finding insertion sequences (IS) in bacterial genomes. To meet these objectives, the thesis is divided into two parts.

First, the genomes of 79 P. aeruginosa ST235 isolates collected worldwide over a 27-year period were examined. A phylogenetic network was built with NeighborNet, a method based on Hamming distances. The time to the most recent common ancestor (TMRCA) was then estimated with a Bayesian approach. We also identified antibiotic resistance determinants, CRISPR-Cas systems, and ST235-specific gene profiles. The results suggest that the ST235 sublineage emerged in Europe around 1984, coinciding with the introduction of fluoroquinolones as an antipseudomonal treatment. The sublineage then seemingly spread from Europe via two independent clones, and ST235 isolates appear to have acquired resistance determinants to aminoglycosides, β-lactams, and carbapenems locally. In addition, all the ST235 genomes carried the exoU gene, which encodes the exotoxin ExoU, and we identified 22 ST235-specific genes clustering in blocks and implicated in transmembrane efflux, DNA processing, and bacterial transformation. These unique genes may have contributed to the poor outcome associated with P. aeruginosa ST235 infections and increased the ability of this international clone to acquire mobile resistance elements.

The second part was to design a new insertion sequence (IS) search tool for next-generation sequencing data, named panISa.
This tool identifies the IS position, direct target repeats (DR), and inverted repeats (IR) from short-read data (.bam/.sam) using only the reference genome, without any IS database. To validate the approach, we used simulated reads from five species (Escherichia coli, Mycobacterium tuberculosis, Pseudomonas aeruginosa, Staphylococcus aureus, and Vibrio cholerae) with 30 random ISs. The experimental set comprises reads of three lengths (100, 150, and 300 nucleotides) at coverages of 20x, 40x, 60x, 80x, and 100x. We evaluated panISa with sensitivity and precision analyses. The sensitivity of IS position detection did not differ significantly with read length, but it did differ significantly with species and read coverage. Across coverages, only 20x differed significantly; for the others (40x-100x), the mean sensitivity was 98% (95% CI: 97.9%-98.2%). Similarly, the mean sensitivity of DR identification was high, at 99.98% (95% CI: 99.957%-99.998%), but the mean sensitivity of IR identification was 73.99% (95% CI: 71.162%-76.826%), which should be improved. The precision of IS position detection differed significantly with species, read coverage, and read length; however, every mean precision value exceeded 95%.

In conclusion, P. aeruginosa ST235 (i) has become prevalent across the globe, potentially because of the selective pressure of fluoroquinolones, and (ii) readily became resistant to aminoglycosides, β-lactams, and carbapenems through mutation and acquisition of resistance elements among local populations. As for the second part, panISa is a sensitive and highly precise tool for identifying insertion sequences from bacterial short-read data, which will be useful for studying epidemiology and bacterial evolution.
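
The sensitivity and precision figures reported for panISa can be illustrated with a minimal sketch. It is not the evaluation code used in the thesis: the matching rule (each simulated IS site is matched by at most one predicted position within a `tolerance` in base pairs), the function names, and the normal-approximation confidence interval are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def count_true_positives(truth, predicted, tolerance=0):
    """Count simulated IS sites recovered by the predictions.

    Each predicted position can match at most one true site.
    `tolerance` (bp) is a hypothetical parameter, not taken from the thesis.
    """
    unused = sorted(predicted)
    tp = 0
    for site in sorted(truth):
        # First still-unused prediction close enough to this true site.
        hit = next((p for p in unused if abs(p - site) <= tolerance), None)
        if hit is not None:
            unused.remove(hit)
            tp += 1
    return tp


def sensitivity_precision(truth, predicted, tolerance=0):
    """Sensitivity = TP / true sites; precision = TP / predicted sites."""
    tp = count_true_positives(truth, predicted, tolerance)
    sensitivity = tp / len(truth) if truth else 0.0
    precision = tp / len(predicted) if predicted else 0.0
    return sensitivity, precision


def mean_ci95(values):
    """Mean with a normal-approximation 95% CI (assumed method)."""
    m = mean(values)
    half_width = 1.96 * stdev(values) / sqrt(len(values))
    return m, m - half_width, m + half_width


# Toy data (invented): three simulated IS sites, four predicted positions.
sens, prec = sensitivity_precision(
    [1000, 5000, 9000], [1002, 5000, 7500, 12000], tolerance=5
)
```

In this toy case two of the three simulated sites are recovered and two of the four predictions are correct, so the sensitivity is 2/3 and the precision is 0.5. Applying `mean_ci95` to per-replicate values would give a mean and interval in the same form as those reported above.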

Pseudomonas aeruginosa is a major nosocomial pathogen. The ST235 clone is the most prevalent of the so-called high-risk international clones. This clone is very frequently multidrug-resistant, which complicates the management of the infections it causes. Despite its clinical importance, the molecular basis of the success of the ST235 clone is not understood. In this work, we sought to understand the spatiotemporal origin of this clone and the molecular basis of its success. Using existing bioinformatics tools, we found that the ST235 clone emerged in Europe in 1984 and that all ST235 isolates produce the exotoxin ExoU. We also identified 22 contiguous genes specific to this clone and involved in transmembrane efflux, DNA processing, and bacterial transformation. This unique combination of genes may have contributed to the severity of the infections caused by this clone and to its ability to acquire antibiotic resistance genes. The worldwide spread of this clone was thus probably favored by the extensive use of fluoroquinolones; it then became locally resistant to aminoglycosides, β-lactams, and carbapenems through mutation and acquisition of resistance elements. We mostly used existing tools, but found that programs for detecting insertion sequences (IS, which play an important role in the evolution of bacterial genomes) were not suited to the data available to us. We therefore developed a tool, called panISa, that detects IS precisely and sensitively from raw sequencing data of bacterial genomes.

Analyse bioinformatique des génomes d'une souche épidémique de Pseudomonas aeruginosa
