
New methods for biological sequence alignment

Marta Gîrdea 1, 2
2 SEQUOIA2 - Algorithms for large scale sequence analysis
LIFL - Laboratoire d'Informatique Fondamentale de Lille, Inria Lille - Nord Europe
Abstract: Biological sequence alignment is a fundamental technique in bioinformatics. It consists of identifying series of similar (conserved) characters that appear in the same order in both sequences, and possibly deducing a set of modifications (substitutions, insertions and deletions) involved in the transformation of one sequence into the other. This technique allows one to infer, based on sequence similarity, whether two or more biological sequences are potentially homologous, i.e. whether they share a common ancestor, thus enabling the understanding of sequence evolution. This thesis addresses sequence comparison problems in two different contexts: homology detection and high-throughput DNA sequencing. The goal of this work is to develop sensitive alignment methods that provide solutions to the following two problems: i) the detection of hidden protein homologies by protein sequence comparison, when the source of the divergence is frameshift mutations, and ii) the mapping of short SOLiD reads (sequences of overlapping di-nucleotides encoded as colors) to a reference genome. In both cases, the same general idea is applied: to implicitly compare DNA sequences in order to detect changes occurring at this level, while manipulating, in practice, other representations (protein sequences, sequences of di-nucleotide codes) that provide additional information and thus help to improve the similarity search. The aim is to design and implement exact and heuristic alignment methods, along with scoring schemes, adapted to these scenarios.
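To illustrate what "sequences of overlapping di-nucleotides encoded as colors" means, the following is a minimal sketch of the commonly used SOLiD color-space convention, in which each pair of adjacent bases maps to one of four colors. It is not taken from the thesis: the function names `to_colors` and `from_colors` are hypothetical, and the 2-bit XOR formulation is an assumed but standard way of expressing the color table.

```python
# Assumed 2-bit base codes; under the usual SOLiD convention the color of a
# di-nucleotide is the XOR of the codes of its two bases.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def to_colors(dna):
    """Encode a DNA string as the colors of its overlapping di-nucleotides."""
    return [CODE[a] ^ CODE[b] for a, b in zip(dna, dna[1:])]


def from_colors(first_base, colors):
    """Decode a color sequence back to DNA, given the first base."""
    out = [first_base]
    for color in colors:
        # XOR is its own inverse, so the next base follows from the previous one.
        out.append(BASES[CODE[out[-1]] ^ color])
    return "".join(out)


reference = "ATGGC"
variant = "ATCGC"  # single substitution G -> C at position 2

print(to_colors(reference))  # [3, 1, 0, 3]
print(to_colors(variant))    # [3, 2, 3, 3]
print(from_colors("A", to_colors(reference)))  # ATGGC
```

In this encoding a single-base substitution inside a read changes two adjacent colors, whereas an isolated color change corresponds to no single substitution; this is the kind of additional information a color-aware alignment can exploit when comparing reads to a reference.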

https://tel.archives-ouvertes.fr/tel-00833311
Contributor : Laurent Noé
Submitted on : Wednesday, June 12, 2013 - 2:14:28 PM
Last modification on : Monday, October 19, 2020 - 10:52:07 AM

Identifiers

  • HAL Id : tel-00833311, version 1

Citation

Marta Gîrdea. New methods for biological sequence alignment. Bioinformatics [q-bio.QM]. Université des Sciences et Technologie de Lille - Lille I, 2010. English. ⟨tel-00833311⟩
