Simple and flexible detection of contiguous repeats using a suffix tree (Preliminary Version)

  • Session IV
  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1448)

Abstract

We study the problem of detecting all occurrences of (primitive) tandem repeats and tandem arrays in a string. We first give a simple time- and space-optimal algorithm to find all tandem repeats, and then modify it to become a time- and space-optimal algorithm for finding only the primitive tandem repeats. Both of these algorithms are then extended to handle tandem arrays. The contribution of this paper is both pedagogical and practical, giving simple algorithms and implementations based on a suffix tree, using only standard tree traversal techniques.
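
To make the terminology concrete, the following is a minimal brute-force sketch in Python. It is not the suffix-tree algorithm of the paper and is far from optimal. It assumes the usual definitions, which the abstract does not spell out: a tandem repeat is a substring of the form ww for a non-empty string w, and it is primitive when w is not itself a power of a shorter string. The function names are hypothetical and chosen for illustration only.

```python
def tandem_repeats(s):
    """Return all (start, period) pairs such that
    s[start:start+period] == s[start+period:start+2*period].

    Brute force: tries every start position and every period that fits.
    """
    n = len(s)
    found = []
    for start in range(n):
        # A repeat of period p starting at `start` needs 2*p characters.
        for period in range(1, (n - start) // 2 + 1):
            left = s[start:start + period]
            right = s[start + period:start + 2 * period]
            if left == right:
                found.append((start, period))
    return found


def is_primitive(w):
    """Return True if the non-empty string w is not u^k for any shorter u, k >= 2.

    Uses the classic test: w is primitive exactly when its first
    occurrence in w + w at an offset >= 1 is at offset len(w).
    """
    return (w + w).find(w, 1) == len(w)


def primitive_tandem_repeats(s):
    """Keep only the tandem repeats ww whose half w is primitive."""
    return [(i, p) for (i, p) in tandem_repeats(s) if is_primitive(s[i:i + p])]


if __name__ == "__main__":
    print(tandem_repeats("aaaa"))            # all tandem repeats
    print(primitive_tandem_repeats("aaaa"))  # only those with primitive half
```

In "aaaa", the occurrence at position 0 with period 2 ("aa" followed by "aa") is a tandem repeat but not a primitive one, because "aa" is a power of "a"; the three period-1 occurrences are primitive.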

Research supported by the German Academic Exchange Service (DAAD). E-mail: stoye@cs.ucdavis.edu

Research partially supported by grant DBI-9723346 from the National Science Foundation, and by grant DE-FG03-90ER60999 from the Department of Energy. E-mail: gusfield@cs.ucdavis.edu

Editor information

Martin Farach-Colton

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Stoye, J., Gusfield, D. (1998). Simple and flexible detection of contiguous repeats using a suffix tree Preliminary Version. In: Farach-Colton, M. (eds) Combinatorial Pattern Matching. CPM 1998. Lecture Notes in Computer Science, vol 1448. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0030787

  • DOI: https://doi.org/10.1007/BFb0030787

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64739-3

  • Online ISBN: 978-3-540-69054-2

  • eBook Packages: Springer Book Archive
