Abstract
A perfect tandem repeat within a string S is a substring r = r_1 ... r_{2l} of S for which r_1 ... r_l = r_{l+1} ... r_{2l}. An approximate tandem repeat is a substring r = r_1 ... r_{l'} ... r_l for which r_1 ... r_{l'} and r_{l'+1} ... r_l are similar. In this paper we consider two criteria of similarity: the Hamming distance (k mismatches) and the edit distance (k differences). For a string S of length n and an integer k, our algorithm reports all locally optimal approximate repeats r = ūû for which the Hamming distance of ū and û is at most k in O(nk log(n/k)) time, or all those for which the edit distance of ū and û is at most k in O(nk log k log n) time.
Partially supported by the New York State Science and Technology Foundation Center for Advanced Technology.
Partially supported by NSF grant CCR-9110255 and the New York State Science and Technology Foundation Center for Advanced Technology.
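The definitions in the abstract can be illustrated with a small brute-force sketch. This is not the algorithm of the paper: it runs in roughly cubic time, it reports every qualifying substring rather than only the locally optimal ones, and the function names (`hamming`, `edit_distance`, `approx_tandem_repeats`) are hypothetical names chosen for this illustration.

```python
def hamming(u, v):
    """Number of mismatching positions between two equal-length strings."""
    if len(u) != len(v):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(a != b for a, b in zip(u, v))


def edit_distance(u, v):
    """Textbook dynamic program for the edit distance (insertions,
    deletions, substitutions), keeping only one previous row."""
    prev = list(range(len(v) + 1))
    for i, a in enumerate(u, 1):
        cur = [i]
        for j, b in enumerate(v, 1):
            cur.append(min(prev[j] + 1,              # delete a
                           cur[j - 1] + 1,           # insert b
                           prev[j - 1] + (a != b)))  # match or substitute
        prev = cur
    return prev[-1]


def approx_tandem_repeats(s, k):
    """Brute force (illustration only): return (start, half_len, mismatches)
    for every substring s[start:start + 2*half_len] whose two halves are
    within Hamming distance k. With k = 0 these are the perfect repeats."""
    n = len(s)
    found = []
    for half in range(1, n // 2 + 1):
        for start in range(n - 2 * half + 1):
            left = s[start:start + half]
            right = s[start + half:start + 2 * half]
            d = hamming(left, right)
            if d <= k:
                found.append((start, half, d))
    return found


if __name__ == "__main__":
    print(approx_tandem_repeats("abab", 0))    # perfect repeats only
    print(approx_tandem_repeats("abcabd", 1))  # allows one mismatch
    print(edit_distance("kitten", "sitting"))
```

For the edit-distance criterion the two halves may have different lengths, so a brute-force variant would have to try every split point l' and call `edit_distance` on the two parts. The sketch leaves that loop out to stay short.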
Copyright information
© 1993 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Landau, G.M., Schmidt, J.P. (1993). An algorithm for approximate tandem repeats. In: Apostolico, A., Crochemore, M., Galil, Z., Manber, U. (eds) Combinatorial Pattern Matching. CPM 1993. Lecture Notes in Computer Science, vol 684. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0029801
DOI: https://doi.org/10.1007/BFb0029801
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-56764-6
Online ISBN: 978-3-540-47732-7
eBook Packages: Springer Book Archive