A clustering approach to Generalized Tree Alignment with application to Alu repeats

  • Sequence Analysis
  • Conference paper
Bioinformatics (GCB 1996)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1278))

Abstract

A formalization of the multiple sequence alignment problem that emphasizes the problem's evolutionary aspect is the Generalized Tree Alignment Problem. Given a set of sequences, this formalization asks for a phylogenetic tree and ancestral sequences such that the implied amount of change necessary to explain the given data is minimal. The problem is computationally hard, and we present a heuristic algorithm for it. Our procedure mimics agglomerative clustering techniques as used for phylogenetic trees while at the same time aligning the sequences using the data structure of sequence graphs. The approach achieves good results in terms of the underlying scoring function. It produces biologically meaningful answers, which we demonstrate in this paper on a set of Alu repeats.

Work supported by DFG grant Vi-160/1
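
As a rough illustration of the agglomerative idea mentioned in the abstract, the following Python sketch repeatedly merges the two closest clusters of sequences and records the merge order as a tree. It is not the authors' procedure: it uses plain unit-cost edit distance and average linkage instead of sequence graphs, and it does not reconstruct ancestral sequences. The function names `edit_distance` and `agglomerate` are hypothetical and chosen only for this example.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance (substitution, insertion, deletion) by dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # match or substitute
        prev = curr
    return prev[-1]


def agglomerate(seqs):
    """Average-linkage agglomerative clustering over pairwise edit distances.

    Returns the merge tree as nested 2-tuples whose leaves are the input
    sequences. Assumes at least one input sequence. Illustration only:
    no sequence graphs and no ancestral sequences are computed.
    """
    # Each cluster is (subtree, indices of the member sequences).
    clusters = [(s, [i]) for i, s in enumerate(seqs)]
    dist = {(i, j): edit_distance(seqs[i], seqs[j])
            for i, j in combinations(range(len(seqs)), 2)}

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(dist[min(i, j), max(i, j)] for i in c1[1] for j in c2[1])
        return total / (len(c1[1]) * len(c2[1]))

    while len(clusters) > 1:
        # Pick the closest pair of clusters (first pair wins on ties).
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = ((clusters[a][0], clusters[b][0]),
                  clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0][0]


if __name__ == "__main__":
    print(agglomerate(["ACGT", "ACGA", "TTTT", "TTTA"]))
```

On the four toy sequences in the usage line, the two similar pairs are joined first and then merged at the root. A tree alignment heuristic would additionally have to score the tree by the amount of change along its edges, which this sketch leaves out.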

Editor information

Ralf Hofestädt Thomas Lengauer Markus Löffler Dietmar Schomburg

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

Cite this paper

Schwikowski, B., Vingron, M. (1997). A clustering approach to Generalized Tree Alignment with application to Alu repeats. In: Hofestädt, R., Lengauer, T., Löffler, M., Schomburg, D. (eds) Bioinformatics. GCB 1996. Lecture Notes in Computer Science, vol 1278. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0033210

  • DOI: https://doi.org/10.1007/BFb0033210

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63370-9

  • Online ISBN: 978-3-540-69524-0

  • eBook Packages: Springer Book Archive
