Combinatorial aspects of genome rearrangements and haplotype networks

Abstract : The dissertation covers two problems motivated by computational biology: genome rearrangements and haplotype networks. Genome rearrangement problems are a particular case of edit distance problems, where one seeks to transform one given object into another using as few operations as possible, with the additional constraint that the set of allowed operations is fixed beforehand; we are also interested in computing the corresponding distances between those objects, i.e. merely computing the minimum number of operations rather than an optimal sequence. Genome rearrangement problems can often be formulated as sorting problems on permutations (viewed as linear orderings of {1,2,...,n}) using as few (allowed) operations as possible. In this thesis, we focus among other operations on "transpositions", which displace intervals of a permutation. Many questions about sorting by transpositions are open, in particular regarding its computational complexity. We use the disjoint cycle decomposition of permutations, rather than the "standard tools" used in genome rearrangements, to prove new upper bounds on the transposition distance, as well as formulae for computing the exact distance in polynomial time in many cases. This decomposition also allows us to solve a counting problem related to the "cycle graph" of Bafna and Pevzner, and to construct a general framework for obtaining lower bounds on any edit distance between permutations by recasting their computation as factorisation problems on related even permutations. Haplotype networks are graphs in which a subset of vertices is labelled, used in comparative genomics as an alternative to trees. We formalise a new method due to Cassens, Mardulyn and Milinkovitch, which consists in building a graph containing a given set of partially labelled trees and with as few edges as possible. We give exact algorithms for solving the problem on two graphs, with an exponential running time in the general case but with a polynomial running time if at least one of the graphs belongs to a particular class.
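
To make the vocabulary of the abstract concrete, the following Python sketch illustrates a transposition in the genome rearrangement sense (exchanging two adjacent intervals, which is not the algebraic 2-cycle) and the disjoint cycle decomposition of a permutation. The function names, the 0-based cut positions i < j < k, and the brute-force breadth-first search are assumptions made for illustration only; they are not the thesis's algorithms, and the search is exponential, so it is only practical for very small permutations.

```python
from collections import deque
from itertools import combinations


def apply_transposition(perm, i, j, k):
    """Exchange the adjacent blocks perm[i:j] and perm[j:k] (0-based, i < j < k)."""
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]


def disjoint_cycles(perm):
    """Disjoint cycle decomposition of a permutation of 1..n, read as p -> perm[p - 1]."""
    seen, cycles = set(), []
    for start in range(1, len(perm) + 1):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = perm[x - 1]
        cycles.append(tuple(cycle))
    return cycles


def transposition_distance(perm):
    """Minimum number of transpositions sorting perm, by exhaustive BFS (illustrative only)."""
    perm = tuple(perm)
    target = tuple(sorted(perm))
    dist = {perm: 0}
    queue = deque([perm])
    while queue:
        cur = queue.popleft()
        if cur == target:
            return dist[cur]
        # Try every choice of three cut positions i < j < k.
        for i, j, k in combinations(range(len(cur) + 1), 3):
            nxt = apply_transposition(cur, i, j, k)
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return None
```

For instance, `apply_transposition((1, 2, 3, 4, 5), 1, 3, 5)` moves the interval `2 3` past `4 5`, and `transposition_distance((3, 2, 1))` explores all sequences of such moves until the identity permutation is reached.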
Document type :
Theses

https://tel.archives-ouvertes.fr/tel-00482196
Contributor : Anthony Labarre
Submitted on : Sunday, May 9, 2010 - 11:50:59 PM
Last modification on : Monday, May 10, 2010 - 8:40:32 AM
Long-term archiving on : Friday, October 19, 2012 - 2:35:51 PM

Identifiers

  • HAL Id : tel-00482196, version 1

Citation

Anthony Labarre. Combinatorial aspects of genome rearrangements and haplotype networks. Computer Science [cs]. Université Libre de Bruxelles, 2008. English. ⟨tel-00482196⟩
