Abstract
The Generalized Tree Alignment Problem is a formalization of the multiple sequence alignment problem that emphasizes its evolutionary aspect. Given a set of sequences, this formalization asks for a phylogenetic tree and ancestral sequences such that the implied amount of change necessary to explain the given data is minimal. The problem is computationally hard, so we present a heuristic algorithm for it. Our procedure mimics the agglomerative clustering techniques used for phylogenetic trees while simultaneously aligning the sequences using the data structure of sequence graphs. The approach achieves good results in terms of the underlying scoring function. It also produces biologically meaningful answers, which we demonstrate on a set of Alu repeats.
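To make the objective concrete, the following minimal sketch scores one candidate solution: a tree whose leaves carry the given sequences and whose inner nodes carry proposed ancestral sequences. It assumes unit-cost edit distance as the measure of change along an edge, which is an illustrative choice and not necessarily the scoring function used in the paper. The names `edit_distance` and `tree_cost` are hypothetical, and the sketch does not implement the clustering heuristic or sequence graphs.

```python
from typing import Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance: substitution, insertion, deletion each cost 1.

    Assumption: the paper's actual scoring function may differ
    (for example, it may use other gap or substitution costs).
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def tree_cost(edges: List[Tuple[str, str]], labels: Dict[str, str]) -> int:
    """Sum of edge costs for a tree whose nodes are all labeled with sequences."""
    return sum(edit_distance(labels[u], labels[v]) for u, v in edges)


# Three given sequences joined to a single hypothetical ancestor "anc".
edges = [("anc", "s1"), ("anc", "s2"), ("anc", "s3")]
leaves = {"s1": "ACGT", "s2": "ACGA", "s3": "TCGT"}

cost_a = tree_cost(edges, {**leaves, "anc": "ACGT"})
cost_b = tree_cost(edges, {**leaves, "anc": "ACGA"})
print(cost_a, cost_b)
```

Under these assumptions the ancestor `ACGT` explains the data with less change than `ACGA`. The Generalized Tree Alignment Problem asks for the tree topology and the ancestral labels that minimize this kind of total together.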
Work supported by DFG grant Vi-160/1
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Schwikowski, B., Vingron, M. (1997). A clustering approach to Generalized Tree Alignment with application to Alu repeats. In: Hofestädt, R., Lengauer, T., Löffler, M., Schomburg, D. (eds) Bioinformatics. GCB 1996. Lecture Notes in Computer Science, vol 1278. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0033210
DOI: https://doi.org/10.1007/BFb0033210
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63370-9
Online ISBN: 978-3-540-69524-0
eBook Packages: Springer Book Archive