Abstract
We study the problem of detecting all occurrences of (primitive) tandem repeats and tandem arrays in a string. We first give a simple time- and space-optimal algorithm to find all tandem repeats, and then modify it into a time- and space-optimal algorithm for finding only the primitive tandem repeats. Both of these algorithms are then extended to handle tandem arrays. The contribution of this paper is both pedagogical and practical: it gives simple algorithms and implementations based on a suffix tree, using only standard tree traversal techniques.
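To make the objects in question concrete, here is a minimal brute-force sketch. It is not the suffix-tree algorithm of the paper, only a naive baseline. It assumes the usual definitions, which the abstract itself does not spell out: a tandem repeat is a substring of the form ww for a nonempty string w, and it is primitive when w is not itself a power of a shorter string. The function names `is_primitive` and `tandem_repeats` are hypothetical and chosen for illustration only.

```python
def is_primitive(w: str) -> bool:
    """Return True if w is nonempty and not a power u^k (k >= 2) of a shorter string u."""
    # Standard trick: w is primitive iff its first occurrence in w + w
    # after position 0 starts exactly at position len(w).
    return len(w) > 0 and (w + w).find(w, 1) == len(w)


def tandem_repeats(s: str, primitive_only: bool = False) -> list[tuple[int, int]]:
    """Naively list (start, len(w)) for every occurrence of ww in s.

    This is a cubic-time illustration of the definition (assumed, not taken
    from the paper); it makes no attempt at the optimal bounds.
    """
    n = len(s)
    found = []
    for i in range(n):
        # The repeat ww of total length 2p must fit inside s[i:].
        for p in range(1, (n - i) // 2 + 1):
            w = s[i:i + p]
            if s[i + p:i + 2 * p] == w and (not primitive_only or is_primitive(w)):
                found.append((i, p))
    return found


if __name__ == "__main__":
    print(tandem_repeats("aaaa"))
    print(tandem_repeats("aaaa", primitive_only=True))
```

In the string "aaaa", the occurrence of "aa" followed by "aa" at position 0 is a tandem repeat but not a primitive one, since "aa" is itself a power of "a"; the `primitive_only` flag filters out exactly such cases.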
Research supported by the German Academic Exchange Service (DAAD). E-mail: stoye@cs.ucdavis.edu
Research partially supported by grant DBI-9723346 from the National Science Foundation, and by grant DE-FG03-90ER60999 from the Department of Energy. E-mail: gusfield@cs.ucdavis.edu
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Stoye, J., Gusfield, D. (1998). Simple and flexible detection of contiguous repeats using a suffix tree Preliminary Version. In: Farach-Colton, M. (eds) Combinatorial Pattern Matching. CPM 1998. Lecture Notes in Computer Science, vol 1448. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0030787
DOI: https://doi.org/10.1007/BFb0030787
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64739-3
Online ISBN: 978-3-540-69054-2
eBook Packages: Springer Book Archive