Evolution of whole genomes through inversions: models and algorithms for duplicates, ancestors, and edit scenarios

Swenson, Krister

doi:10.5075/epfl-thesis-4552

2009

Abstract

Advances in sequencing technology are yielding DNA sequence data at an alarming rate – a rate reminiscent of Moore's law. Biologists' abilities to analyze this data, however, have not kept pace. On the other hand, the discrete and mechanical nature of the cell life-cycle has been tantalizing to computer scientists. Thus in the 1980s, pioneers of the field now called Computational Biology began to uncover a wealth of computer science problems, some confronting modern biologists and some hidden in the annals of the biological literature. In particular, many interesting twists were introduced to classical string matching, sorting, and graph problems. One such problem, first posed in 1941 but rediscovered in the early 1980s, is that of sorting by inversions (also called reversals): given two permutations, find the minimum number of inversions required to transform one into the other, where an inversion reverses the order of a contiguous subpermutation. Indeed, many genomes have evolved mostly or only through inversions. Thus it becomes possible to trace evolutionary histories by inferring sequences of such inversions that led to today's genomes from a distant common ancestor. But unlike the classic edit distance problem, where string editing is relatively simple, editing permutations in this way has proved to be more complex. In this dissertation, we extend the theory so as to make these edit distances more broadly applicable and faster to compute, and work towards more powerful tools that can accurately infer evolutionary histories. In particular, we present work that for the first time considers genomic distances between any pair of genomes, with no limitation on the number of occurrences of a gene. Next we show that there are conditions under which an ancestral genome (or one close to the true ancestor) can be reliably reconstructed. Finally we present new methodology that computes a minimum-length sequence of inversions to transform one permutation into another in, on average, O(n log n) steps, whereas the best worst-case algorithm to compute such a sequence uses O(n√(n log n)) steps.
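
As a small illustration of the sorting-by-inversions problem stated in the abstract, the sketch below applies an inversion to a permutation and finds the minimum number of inversions between two tiny permutations by exhaustive breadth-first search. It is not the method of the thesis: it assumes unsigned permutations (genome models usually also flip gene orientation), it takes exponential time, and the names `invert` and `inversion_distance` are hypothetical helpers chosen for this example.

```python
from collections import deque


def invert(perm, i, j):
    """Return a copy of perm with the segment perm[i..j] (inclusive) reversed."""
    return perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]


def inversion_distance(source, target):
    """Minimum number of inversions turning source into target.

    Brute-force breadth-first search over all permutations reachable
    from source; only practical for very short permutations.
    """
    source, target = tuple(source), tuple(target)
    if sorted(source) != sorted(target):
        raise ValueError("source and target must contain the same elements")

    distance = {source: 0}  # permutation -> fewest inversions needed to reach it
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            return distance[current]
        n = len(current)
        # Try every segment of length >= 2 (reversing one element changes nothing).
        for i in range(n):
            for j in range(i + 1, n):
                neighbor = invert(current, i, j)
                if neighbor not in distance:
                    distance[neighbor] = distance[current] + 1
                    queue.append(neighbor)
    return None  # unreachable for valid input: inversions generate every permutation


if __name__ == "__main__":
    print(invert((1, 2, 3, 4, 5), 1, 3))
    print(inversion_distance((2, 3, 1), (1, 2, 3)))
```

The search visits up to n! permutations, which is why efficient algorithms such as those discussed in the abstract matter for genome-scale inputs.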

Details

Title Evolution of whole genomes through inversions: models and algorithms for duplicates, ancestors, and edit scenarios

Author(s) Swenson, Krister

Advisor(s) Moret, Bernard M. E.

Pagination 101

Date 2009

Publisher Lausanne, EPFL

Keywords inversions; reversals; sorting; pairwise distance; duplications; median; genomes; evolution; phylogeny; orthology; positional homology

Language English

DOI https://doi.org/10.5075/epfl-thesis-4552

Other identifier(s) urn: urn:nbn:ch:bel-epfl-thesis4552-7

Laboratories LCBB

Record creation date 2009-10-15
