
Use of data analysis techniques to solve specific bioinformatics problems

Abstract : The quantity of sequenced genetic data is increasing exponentially, driven by increasingly powerful sequencing tools, high-throughput sequencing in particular. These data are also increasingly accessible through online databases. This greater availability opens up new areas of study that require statisticians and bioinformaticians to develop appropriate tools. At the same time, ongoing statistical advances in areas such as clustering, dimensionality reduction, and regression need to be regularly adapted to the context of bioinformatics. The objective of this thesis is to apply advanced statistical techniques to bioinformatics problems. In this manuscript we present the results of our work on the clustering of genetic sequences via Laplacian eigenmaps and a Gaussian mixture model, the study of the propagation of transposable elements in the genome via a branching process, the analysis of metagenomic data in ecology via ROC curves, and ordinal polytomous regression penalized by the l1-norm.

Cited literature : 99 references
Contributor : Abes Star
Submitted on : Friday, October 11, 2019 - 11:15:11 AM
Last modification on : Thursday, November 12, 2020 - 9:42:17 AM


Version validated by the jury (STAR)


  • HAL Id : tel-02312486, version 1


Serge Moulin. Use of data analysis techniques to solve specific bioinformatics problems. Bioinformatics [q-bio.QM]. Université Bourgogne Franche-Comté, 2018. English. ⟨NNT : 2018UBFCD049⟩. ⟨tel-02312486⟩


