Robustness of Phylogenetic Trees

Abstract : Phylogenetic trees are used daily in many fields of biology, most notably in the functional and structural study of genomes. They provide a powerful framework for studying evolution but are also an abundant source of statistically challenging problems. Most, if not all, applications of phylogenetics require accurate phylogenetic estimates. In general, accuracy depends on four factors: (1) appropriate selection of genes, (2) sufficient data size, (3) an accurate analytical method, and (4) adequate taxon sampling. This thesis addresses four issues directly related to these factors. In the first part, we use concentration inequalities to upper-bound the amount of data needed to choose the more accurate of two trees when the analytical model is accurate. Using Edgeworth expansions, we then present a procedure to select congruent genes from a list of target genes. In the second part, we propose two procedures, based on influence functions and sensitivity curves, to identify influential nucleotides and taxa, which are likely to impede inference and lead to non-robust estimates. We show that as few as one nucleotide or taxon can have a drastic impact on the estimates, discuss the biological implications of this result, and provide methods to achieve greater robustness of the trees.
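To make the idea of an influential nucleotide concrete, here is a minimal sketch of a leave-one-site-out influence score. It is not the procedure from the thesis, whose definitions are not given in this record. As a stand-in for a full tree estimate it uses the Jukes-Cantor (1969) distance between two aligned sequences, and the function names `jc69_distance` and `site_influences`, the scaling by the number of sites, and the toy sequences are all assumptions made for illustration.

```python
import math


def jc69_distance(seq_a, seq_b):
    """Jukes-Cantor (1969) distance between two aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    # Proportion of sites at which the two sequences differ.
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 0.75:
        raise ValueError("sequences are saturated; distance is undefined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def site_influences(seq_a, seq_b):
    """Leave-one-site-out influence of each site on the distance.

    Hypothetical score: n * (estimate on all sites - estimate without site i).
    A large absolute value flags a site that strongly drives the estimate.
    """
    n = len(seq_a)
    full = jc69_distance(seq_a, seq_b)
    influences = []
    for i in range(n):
        reduced = jc69_distance(seq_a[:i] + seq_a[i + 1:],
                                seq_b[:i] + seq_b[i + 1:])
        influences.append(n * (full - reduced))
    return influences


if __name__ == "__main__":
    # Toy alignment (assumed data): the sequences differ only at the last site.
    a = "ACGTACGTAC"
    b = "ACGTACGTAA"
    for i, score in enumerate(site_influences(a, b)):
        print(f"site {i}: influence {score:+.3f}")
```

In this toy case the single differing site carries a positive score much larger in magnitude than any other, which is the kind of pattern an influence-based diagnostic is meant to reveal. Applying the same idea to trees would mean re-estimating the tree with each site or taxon removed and comparing the results.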
Document type :
Theses

Cited literature [194 references]

https://tel.archives-ouvertes.fr/tel-00472052
Contributor : Mahendra Mariadassou
Submitted on : Friday, April 9, 2010 - 1:33:15 PM
Last modification on : Friday, September 20, 2019 - 4:34:03 PM
Long-term archiving on : Tuesday, September 14, 2010 - 6:16:36 PM

Identifiers

  • HAL Id : tel-00472052, version 1

Citation

Mahendra Mariadassou. Robustness of Phylogenetic Trees. Mathematics [math]. Université Paris Sud - Paris XI, 2009. English. ⟨tel-00472052⟩

Metrics

Record views : 390
Files downloads : 649