Use of data analysis techniques to solve specific bioinformatics problems

Abstract : The quantity of sequenced genetic data is increasing exponentially, driven by increasingly powerful sequencing tools, high-throughput sequencing in particular. These data are also increasingly accessible through online databases. This greater availability opens up new areas of study that require statisticians and bioinformaticians to develop appropriate tools. At the same time, ongoing statistical progress in areas such as clustering, dimensionality reduction and regression needs to be regularly adapted to the context of bioinformatics. The objective of this thesis is to apply advanced statistical techniques to bioinformatics problems. In this manuscript we present the results of our work on the clustering of genetic sequences via Laplacian eigenmaps and a Gaussian mixture model, the study of the propagation of transposable elements in the genome via a branching process, and the analysis of metagenomic data in ecology via ROC curves or ordinal polytomous regression penalized by the l1-norm.
Document type :
Theses

Cited literature [99 references]

https://tel.archives-ouvertes.fr/tel-02312486
Contributor : Abes Star
Submitted on : Friday, October 11, 2019 - 11:15:11 AM
Last modification on : Saturday, October 12, 2019 - 1:20:11 AM

File

these_A_MOULIN_Serge_2018.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-02312486, version 1

Citation

Serge Moulin. Use of data analysis techniques to solve specific bioinformatics problems. Bioinformatics [q-bio.QM]. Université Bourgogne Franche-Comté, 2018. English. ⟨NNT : 2018UBFCD049⟩. ⟨tel-02312486⟩
