Abstract
Calculation of dot-matrices is a widespread tool in pairwise sequence comparison. Recent studies have demonstrated the usefulness of dot-matrices for multiple sequence alignment. Viewing dot-matrices as projections of unknown n-dimensional points, we consider the multiple alignment problem (for n sequences) as an n-dimensional image reconstruction problem with noise. From this perspective we introduce and develop the filtering method due to Vingron and Argos (J. Mol. Biol. (1991), 218, pp. 33–43). We discuss a conjecture of theirs regarding the number of iterations their algorithm requires and demonstrate that this number may be large. We introduce an improved version of the original algorithm that avoids costly dot-matrix multiplications and runs in O(n³·L³) time, where L is the length of the longest sequence. This is equivalent to the cost of only one iteration of the original algorithm. We also discuss applications to DNA/protein sequence comparisons.
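The filtering idea can be illustrated with a small sketch. It is a simplified illustration under stated assumptions, not the authors' implementation: dots are scored by exact character identity, only three sequences are used, and a single filtering pass is made. The function names `dot_matrix`, `bool_product`, and `filter_once` are hypothetical. A dot between sequences A and B is kept only if some position in a third sequence C matches both, which is a Boolean product of the A-C and C-B dot-matrices.

```python
def dot_matrix(a, b):
    """Boolean dot-matrix: entry [i][j] is True when a[i] == b[j].

    Assumption: exact identity stands in for whatever similarity
    criterion a real dot-matrix program would use.
    """
    return [[x == y for y in b] for x in a]


def bool_product(p, q):
    """Naive Boolean matrix product.

    result[i][j] is True if some k has p[i][k] and q[k][j].
    For L x L matrices this costs O(L^3) operations.
    """
    inner = len(q)
    cols = len(q[0]) if q else 0
    return [
        [any(p[i][k] and q[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(p))
    ]


def filter_once(d_ab, d_ac, d_cb):
    """Keep a dot (i, j) of d_ab only if it is supported through C."""
    support = bool_product(d_ac, d_cb)
    return [
        [d_ab[i][j] and support[i][j] for j in range(len(d_ab[i]))]
        for i in range(len(d_ab))
    ]


def dots(matrix):
    """List the (row, column) positions of all True entries."""
    return [(i, j) for i, row in enumerate(matrix) for j, v in enumerate(row) if v]


a, b, c = "ACGT", "AGGT", "ACTT"
d_ab = dot_matrix(a, b)
filtered = filter_once(d_ab, dot_matrix(a, c), dot_matrix(c, b))
print(dots(d_ab))      # dots before filtering
print(dots(filtered))  # dots that survive one pass through C
```

In this toy input the two G-G dots between A and B disappear because C contains no G. With n sequences, filtering each of the roughly n² pairwise matrices through each remaining sequence by a naive product costs on the order of n³·L³ operations per pass, which matches the per-iteration cost quoted above.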
This research was supported by grants from the National Science Foundation (DMS 90-05833, DMS 90-05833) and the National Institutes of Health (GM-36230). This paper was written while P.A.P. was at the Department of Mathematics, University of Southern California, Los Angeles.
References
Aho, A.V., Hopcroft, J.E. and Ullman, J.D. (1974) The Design and Analysis of Computer Algorithms. Addison-Wesley.
Altschul, S.F. and Lipman, D.J. (1989) Trees, stars, and multiple biological sequence alignment. SIAM J. Appl. Math., 49, 179–209.
Argos, P. (1987) A Sensitive Procedure to Compare Amino Acid Sequences. J. Mol. Biol., 193, 385–396.
Bern, M. and Eppstein, D. (1992) Mesh Generation and Optimal Triangulation. In: Computing in Euclidean Geometry, F.K. Hwang and D.-Z. Du, editors, World Scientific, 23–90.
Carrillo, H. and Lipman, D. (1988) The multiple sequence alignment problem in biology. SIAM J. Appl. Math., 48, 1073–1082.
Chan, S.C., Wong, A.K.C. and Chiu, D.K.Y. (1992) A survey of multiple sequence comparison methods. Bull. Math. Biol., 54, 563–598.
Gotoh, O. (1986) Alignment of Three Biological Sequences with an Efficient Trace-back Procedure. J. Theor. Biol., 121, 327–337.
Gotoh, O. (1990) Consistency of Optimal Sequence Alignments. Bull. Math. Biol., 52, 509–525.
Karlin, S., Morris, M., Ghandour, G. and Leung, M.-Y. (1988) Efficient algorithms for molecular sequence analysis. Proc. Nat. Acad. Sci. U.S.A., 85, 841–845.
Maizel, J.V. and Lenk, R.P. (1981) Enhanced graphic matrix analysis of nucleic acid and protein sequences. Proc. Nat. Acad. Sci. U.S.A., 78, 7665–7669.
McMorris, F.R. and Meacham, C.F. (1983) Partition intersection graphs. Ars Combin., 16-B, 133–138.
Miller, W. (1993) Building multiple alignments from pairwise alignments. Comp. Appl. Biosci., to appear.
Murata, M., Richardson, J.S. and Sussman, J.L. (1985) Simultaneous comparison of three protein sequences. Proc. Nat. Acad. Sci. U.S.A., 82, 3073–3077.
Pevzner, P.A. (1992) Multiple alignment, communication cost, and graph matchings. SIAM J. Appl. Math., 52, 1763–1779.
Roytberg, M.A. (1992) A search for common patterns in many sequences. Comp. Appl. Biosci., 8, 57–64.
Schuler, G.D., Altschul, S.F. and Lipman, D.J. (1991) A Workbench for Multiple Sequence Alignment Construction and Analysis. PROTEINS: Structure, Function and Genetics, 9, 180–190.
Sobel, E. and Martinez, H. (1986) A multiple sequence alignment program. Nucleic Acids Res., 14, 363–374.
Vihinen, M. (1988) An algorithm for simultaneous comparison of several sequences. Comp. Appl. Biosci., 4, 89–92.
Vingron, M. and Argos, P. (1989) A fast and sensitive multiple sequence alignment algorithm. Comp. Appl. Biosci., 5, 115–121.
Vingron, M. and Argos, P. (1991) Motif recognition and alignment for many sequences by comparison of dot-matrices. J. Mol. Biol., 218, 33–43.
Waterman, M.S. (1984) General methods of sequence comparison. Bull. Math. Biol., 46, 473–500.
Waterman, M.S., Arratia, R. and Galas, D.J. (1984) Pattern recognition in several sequences: Consensus and Alignment. Bull. Math. Biol., 46, 515–527.
Waterman, M.S. and Eggert, M. (1987) A new algorithm for best subsequence alignments with application to tRNA-rRNA comparisons. J. Mol. Biol., 197, 723–725.
Copyright information
© 1993 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vingron, M., Pevzner, P.A. (1993). Multiple sequence comparison and n-dimensional image reconstruction. In: Apostolico, A., Crochemore, M., Galil, Z., Manber, U. (eds) Combinatorial Pattern Matching. CPM 1993. Lecture Notes in Computer Science, vol 684. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0029809
DOI: https://doi.org/10.1007/BFb0029809
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-56764-6
Online ISBN: 978-3-540-47732-7
eBook Packages: Springer Book Archive