
Double String Tandem Repeats

Algorithmica

Abstract

A tandem repeat is an occurrence of two adjacent identical substrings. In this paper, we introduce the notion of a double string, which consists of two parallel strings, and we study the problem of locating all tandem repeats in a double string. The problem introduced here has applications beyond actual double strings, as we illustrate by solving two different problems using the algorithm for the double string tandem repeats problem. The first problem is that of finding all corner-sharing tandems in a 2-dimensional text, defined by Apostolico and Brimkov. The second problem is that of finding all scaled tandem repeats in a 1-dimensional text, where a scaled tandem repeat is defined as a string \(UU'\) such that \(U'\) is a discrete scale of \(U\). In addition to the algorithms for exact tandem repeats, we also present algorithms that solve the problem in the inexact sense, allowing up to \(k\) mismatches. We believe that this framework will open a new perspective on other problems in the future.
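To make the definitions above concrete, here is a minimal brute-force sketch in Python for an ordinary (single) string. It is not the algorithm of the paper and makes no attempt at efficiency. The function names `tandem_repeats` and `is_scaled_tandem` are hypothetical, and the scaled check assumes that a discrete scale of \(U\) by an integer factor repeats each character of \(U\) that many times; this reading comes from the scaled matching literature and is not spelled out in the abstract.

```python
def tandem_repeats(text, k=0):
    """Return (start, half_length) pairs such that the two adjacent halves
    text[start:start+half_length] and text[start+half_length:start+2*half_length]
    differ in at most k positions. With k=0 these are exact tandem repeats."""
    n = len(text)
    found = []
    for half in range(1, n // 2 + 1):
        for start in range(0, n - 2 * half + 1):
            mismatches = 0
            for i in range(half):
                if text[start + i] != text[start + half + i]:
                    mismatches += 1
                    if mismatches > k:
                        break  # too many mismatches, stop comparing early
            if mismatches <= k:
                found.append((start, half))
    return found


def is_scaled_tandem(s, u_len, scale):
    """Check whether s == U + U', where U = s[:u_len] and U' is U with each
    character repeated `scale` times (assumed meaning of discrete scaling)."""
    u = s[:u_len]
    scaled = "".join(ch * scale for ch in u)
    return s == u + scaled
```

For example, `tandem_repeats("abab")` reports the single exact tandem repeat starting at position 0 with half-length 2, and `is_scaled_tandem("abaabb", 2, 2)` holds because `"aabb"` is `"ab"` scaled by 2 under the stated assumption.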


Notes

  1. In DNA there are specific relationships between corresponding bases, while our definition of a double string does not imply any such relationship.

References

  1. Amir, A., Butman, A., Lewenstein, M.: Real scaled matching. Inf. Process. Lett. 70(4), 185–190 (1999)


  2. Apostolico, A., Brimkov, V.E.: Optimal discovery of repetitions in 2D. Discret. Appl. Math. 151(1–3), 5–20 (2005)


  3. Butman, A., Eres, R., Landau, G.M.: Scaled and permuted string matching. Inf. Process. Lett. 92(6), 293–297 (2004)


  4. Crochemore, M., Ilie, L., Rytter, W.: Repetitions in strings: algorithms and combinatorics. Theoret. Comput. Sci. 410(50), 5227–5235 (2009). Mathematical Foundations of Computer Science (MFCS 2007)

  5. Galil, Z., Giancarlo, R.: Improved string matching with \(k\) mismatches. SIGACT News 17(4), 52–54 (1986)


  6. Geizhals, S.H., Sokol, D.: Finding maximal 2-dimensional palindromes. Inf. Comput. 266, 161–172 (2019)


  7. Gusfield, D.: Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology. Cambridge University Press, Cambridge (1997)


  8. Harel, D., Tarjan, R.E.: Fast algorithms for finding nearest common ancestors. SIAM J. Comput. 13(2), 338–355 (1984)


  9. Iliopoulos, C.S., Moore, D., Smyth, W.F.: A characterization of the squares in a Fibonacci string. Theoret. Comput. Sci. 172(1), 281–291 (1997)


  10. Karp, R.M., Miller, R.E., Rosenberg, A.L.: Rapid identification of repeated patterns in strings, trees and arrays. In: Proceedings of the 4th Annual ACM Symposium on Theory of Computing (STOC), pp. 125–136 (1972)

  11. Knuth, D.E., Morris, J.H., Jr., Pratt, V.R.: Fast pattern matching in strings. SIAM J. Comput. 6(2), 323–350 (1977)


  12. Kolpakov, R.M., Kucherov, G.: Finding maximal repetitions in a word in linear time. In: 40th Annual Symposium on Foundations of Computer Science, FOCS ’99, 17–18 October 1999, New York, NY, USA, pp. 596–604. IEEE Computer Society (1999)

  13. Landau, G.M., Schmidt, J.P., Sokol, D.: An algorithm for approximate tandem repeats. J. Comput. Biol. 8, 1–18 (2001)


  14. Landau, G.M., Vishkin, U.: Fast string matching with k differences. J. Comput. Syst. Sci. 37(1), 63–78 (1988)


  15. Landau, G.M., Vishkin, U.: Fast parallel and serial approximate string matching. J. Algorithms 10(2), 157–169 (1989)


  16. Liu, J.J., Huang, G.S., Wang, Y.L.: A fast algorithm for finding the positions of all squares in a run-length encoded string. Theoret. Comput. Sci. 410(38), 3942–3948 (2009)


  17. Main, M.G., Lorentz, R.J.: An O(n log n) algorithm for finding all repetitions in a string. J. Algorithms 5(3), 422–432 (1984)


  18. Ukkonen, E.: On-line construction of suffix trees. Algorithmica 14(3), 249–260 (1995)



Funding

The authors A. Amir and G. M. Landau have been partially supported by Grant No. 2018141 from the United States-Israel Binational Science Foundation (BSF) and Israel Science Foundation Grant 1475-18. D. Sokol was also partially supported by BSF Grant No. 2018141. S. Marcus was partially supported by the Professional Staff Congress City University of New York Research Award 63164-00 51.

Author information

Corresponding author

Correspondence to Dina Sokol.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.



About this article


Cite this article

Amir, A., Butman, A., Landau, G.M. et al. Double String Tandem Repeats. Algorithmica 85, 170–187 (2023). https://doi.org/10.1007/s00453-022-01016-9


  • DOI: https://doi.org/10.1007/s00453-022-01016-9
