
Statistical learning for omics data integration (Apprentissage statistique pour l'intégration de données omiques)

Abstract : The development of high-throughput sequencing technologies has led to the production of high-dimensional, heterogeneous datasets at different scales of living systems. Integrative methods have been shown to be relevant for processing such data, but they remain challenging. This thesis gathers methodological contributions for the simultaneous exploration of heterogeneous multi-omics datasets. Kernels and kernel methods provide a natural framework for this problem because they can accommodate the specific nature of each dataset while still allowing the datasets to be combined. However, when the number of samples to process is high, kernel methods suffer from several drawbacks: their computational complexity increases and the interpretability of the model is lost. The first part of my work focuses on adapting two exploratory kernel methods: kernel principal component analysis (K-PCA) and the kernel self-organizing map (K-SOM). The proposed adaptations first address the scaling of both K-SOM and K-PCA to omics datasets and then improve the interpretability of the models. In the second part, I study multiple kernel learning as a way to combine multiple omics datasets. The efficiency of the proposed methods is demonstrated in the domain of microbial ecology: eight TARA Oceans datasets are integrated and analysed using K-PCA.
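As a rough illustration of the idea in the abstract (one kernel per dataset, a combined kernel, then K-PCA), here is a minimal sketch in plain Python. It is not the method developed in the thesis: the toy tables `abundances` and `environment`, the choice of a cosine-normalized linear kernel and a Gaussian kernel, and the uniform weights in `combine` are all assumptions made for the example, whereas multiple kernel learning would learn the combination. The leading component is obtained by power iteration on the centered kernel.

```python
import math

def linear_kernel(X):
    """Gram matrix of inner products between the rows of X."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]

def gaussian_kernel(X, gamma=1.0):
    """Gaussian (RBF) kernel between the rows of X."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj)))
             for xj in X] for xi in X]

def cosine_normalize(K):
    """Rescale K so that every sample has unit self-similarity."""
    n = len(K)
    return [[K[i][j] / math.sqrt(K[i][i] * K[j][j]) for j in range(n)]
            for i in range(n)]

def combine(kernels, weights=None):
    """Weighted sum of kernels; uniform weights are an assumption here."""
    m, n = len(kernels), len(kernels[0])
    weights = weights or [1.0 / m] * m
    return [[sum(w * K[i][j] for w, K in zip(weights, kernels))
             for j in range(n)] for i in range(n)]

def center(K):
    """Double-center K, i.e. center the data in feature space."""
    n = len(K)
    row_means = [sum(r) / n for r in K]
    grand_mean = sum(row_means) / n
    return [[K[i][j] - row_means[i] - row_means[j] + grand_mean
             for j in range(n)] for i in range(n)]

def leading_component(K, iters=500):
    """Leading eigenpair of a symmetric PSD matrix by power iteration."""
    n = len(K)
    v = [float(i + 1) for i in range(n)]  # deterministic starting vector
    for _ in range(iters):
        w = [sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(K[i][j] * v[j] for j in range(n))
                     for i in range(n))
    return eigenvalue, v

# Two hypothetical tables measured on the same four samples.
abundances = [[10, 0, 2], [8, 1, 3], [0, 9, 7], [1, 12, 6]]
environment = [[0.1, 1.0], [0.2, 0.9], [0.9, 0.2], [1.0, 0.1]]

# One kernel per table, each suited to its data, then a single combined kernel.
k_abundances = cosine_normalize(linear_kernel(abundances))
k_environment = gaussian_kernel(environment, gamma=1.0)
combined = combine([k_abundances, k_environment])

# K-PCA on the combined kernel: sample scores on the first component.
centered = center(combined)
eigenvalue, vector = leading_component(centered)
scores = [math.sqrt(eigenvalue) * x for x in vector]
print(scores)
```

Every kernel is an n x n matrix over the samples, which is why tables of different types can be summed once each has its own kernel. It is also why cost grows quickly with the number of samples, the scaling issue the abstract mentions.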

Cited literature [188 references]

Contributor : Jérôme Mariette
Submitted on : Wednesday, December 20, 2017 - 9:44:14 AM
Last modification on : Thursday, March 17, 2022 - 3:46:01 PM


Files produced by the author(s)


  • HAL Id : tel-01666744, version 2


Jérôme Mariette. Apprentissage statistique pour l'intégration de données omiques. Bio-informatique [q-bio.QM]. UPS Toulouse - Université Toulouse 3 Paul Sabatier, 2017. Français. ⟨tel-01666744v2⟩


