Skip to main content

Advertisement

Log in

Factorizing Strings into Repetitions

  • Published:
Theory of Computing Systems Aims and scope Submit manuscript

Abstract

A factorization f1,…,fm of a string w is called a repetition factorization of w if each factor fi is a repetition, namely, \(f_{i} = x^{k}x^{\prime }\) for some non-empty string x, an integer k ≥ 2, and \(x^{\prime }\) being a proper prefix of x. Dumitran et al. (Proc. SPIRE 2015) proposed an algorithm which computes an arbitrary repetition factorization of a given string w in O(n) time, where n is the length of w. The number of factors (i.e. repetitions) contained in the output of their algorithm is not known or guaranteed. In this paper, we propose two algorithms for computing smallest/largest repetition factorizations in \(O(n \log n)\) time, which respectively consist of the smallest/largest number of factors. The first algorithm is a simple \(O(n \log n)\)-space algorithm based on a reduction of the problem to the shortest/longest path problem on the DAG of size \(O(n \log n)\). The second one simulates the dynamic programming algorithm for shortest/longest path problem within O(n) space based on the idea of the first algorithm. Moreover, we discuss combinatorial structures of smallest/largest repetition factorizations of Fibonacci strings.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7

Similar content being viewed by others

Notes

  1. A preliminary version of this paper appeared as [16].

References

  1. Badkobeh, G., Bannai, H., Goto, K., I, T., Iliopoulos, C.S., Inenaga, S., Puglisi, S.J., Sugimoto, S.: Closed factorization. Discret. Appl. Math. 212, 23–29 (2016)

    Article  MathSciNet  Google Scholar 

  2. Bannai, H., I, T., Inenaga, S., Nakashima, Y., Takeda, M., Tsuruta, K.: The “runs” theorem. SIAM J. Comput. 46(5), 1501–1514 (2017)

    Article  MathSciNet  Google Scholar 

  3. Bannai, H., Gagie, T., Inenaga, S., Kärkkäinen, J., Kempa, D., Piatkowski, M., Sugimoto, S.: Diverse palindromic factorization is NP-complete. Int. J. Found. Comput. Sci. 29(2), 143–164 (2018)

    Article  MathSciNet  Google Scholar 

  4. Borozdin, K., Kosolobov, D., Rubinchik, M., Shur, A.M.: Palindromic Length in Linear Time. In: CPM 2017, pp. 23:1–23:12 (2017)

  5. Chen, K.T., Fox, R.H., Lyndon, R.C.: Free differential calculus. iv. The quotient groups of the lower central series. Ann. Math. 68(1), 81–95 (1958)

    Article  MathSciNet  Google Scholar 

  6. Crochemore, M., Ilie, L.: Computing longest previous factor in linear time and applications. Inf. Process. Lett. 106(2), 75–80 (2008). https://doi.org/10.1016/j.ipl.2007.10.006

    Article  MathSciNet  Google Scholar 

  7. Crochemore, M., Rytter, W.: Sqares, cubes, and time-space efficient string searching. Algorithmica 13(5), 405–425 (1995)

    Article  MathSciNet  Google Scholar 

  8. Crochemore, M., Ilie, L., Smyth, W.F.: A simple algorithm for computing the Lempel Ziv factorization. In: Proceedings of DCC 2008, pp. 482–488 (2008)

  9. Cummings, L.J., Moore, D., Karhumäki, J.: Borders of Fibonacci strings. J. Comb. Math. Comb. Comput. 20, 81–88 (1996)

    MathSciNet  MATH  Google Scholar 

  10. Dumitran, M., Manea, F., Nowotka, D.: On prefix/suffix-square free words. In: Proceedings of SPIRE, pp. 54–66 (2015)

  11. Duval, J.: Factorizing words over an ordered alphabet. J. Algorithms 4(4), 363–381 (1983)

    Article  MathSciNet  Google Scholar 

  12. Ellert, J., Fischer, J.: Linear time runs over general ordered alphabets. In: Bansal, N., Merelli, E., Worrell, J. (eds.) 48th International Colloquium on Automata, Languages, and Programming, ICALP 2021, July 12–16, 2021, Glasgow, Scotland (Virtual Conference), LIPIcs, vol. 198, pp. 63:1–63:16. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2021)

  13. Fici, G., Gagie, T., Kärkkäinen, J., Kempa, D.: A subquadratic algorithm for minimum palindromic factorization. J. Discrete Algorithms 28, 41–48 (2014)

    Article  MathSciNet  Google Scholar 

  14. Fine, N.J., Wilf, H.S.: Uniqueness theorems for periodic functions. Proc. Am. Math. Soc. 16(1), 109–114 (1965)

    Article  MathSciNet  Google Scholar 

  15. Fraenkel, A.S., Simpson, J.: The exact number of squares in fibonacci words. Theor. Comput. Sci. 218(1), 95–106 (1999)

    Article  MathSciNet  Google Scholar 

  16. Inoue, H., Matsuoka, Y., Nakashima, Y., Inenaga, S., Bannai, H., Takeda, M.: Computing smallest and largest repetition factorizations in O(n log n) time. In: Holub, J., Zdárek, J. (eds.) Proceedings of the Prague Stringology Conference 2016, Prague, Czech Republic, August 29–31, 2016, pp. 135–145 (2016)

  17. Kolpakov, R.M., Kucherov, G.: Finding maximal repetitions in a word in linear time. In: Proceedings of FOCS 1999, pp. 596–604 (1999)

  18. Kolpakov, R., Podolskiy, M., Posypkin, M., Khrapov, N.: Searching of gapped repeats and subrepetitions in a word. J. Discrete Algorithms 46–47, 1–15 (2017)

    Article  MathSciNet  Google Scholar 

  19. Kufleitner, M.: On bijective variants of the Burrows-Wheeler transform. In: Proceedings of PSC 2009, pp. 65–79 (2009)

  20. Lothaire, M.: Combinatorics on Words. Addison-Wesley, Reading (1983)

  21. Matsuoka, Y., Inenaga, S., Bannai, H., Takeda, M., Manea, F.: Factorizing a string into squares in linear time. In: Proceedings of CPM 2016, pp. 27:1–27:12 (2016)

  22. Nakashima, Y., Takagi, T., Inenaga, S., Bannai, H., Takeda, M.: Constructing LZ78 tries and position heaps in linear time for large alphabets. Inf. Process. Lett. 115(9), 655–659 (2015)

    Article  MathSciNet  Google Scholar 

  23. Storer, J., Szymanski, T.: Data compression via textual substitution. J. ACM 29(4), 928–951 (1982)

    Article  MathSciNet  Google Scholar 

  24. Tanimura, Y., Fujishige, Y., I, T., Inenaga, S., Bannai, H., Takeda, M.: A faster algorithm for computing maximal α-gapped repeats in a string. In: Proceedings of SPIRE 2015, pp. 124–136 (2015)

  25. I, T., Sugimoto, S., Inenaga, S., Bannai, H., Takeda, M.: Computing palindromic factorizations and palindromic covers on-line. In: Proceedings of CPM 2014, pp. 150–161 (2014)

  26. Welch, T.A.: A technique for high performance data compression. IEEE Comput. 17, 8–19 (1984)

    Article  Google Scholar 

  27. Ziv, J., Lempel, A.: A universal algorithm for sequential data compression. IEEE Trans. Inf. Theory IT-23(3), 337–349 (1977)

    Article  MathSciNet  Google Scholar 

  28. Ziv, J., Lempel, A.: Compression of individual sequences via variable-length coding. IEEE Trans. Inf. Theory 24(5), 530–536 (1978)

    Article  Google Scholar 

Download references

Acknowledgments

This work was supported by JSPS KAKENHI Grant Numbers 25240003, 26280003, JP16H02783, JP17H01697, JP18K18002, JP18H04098, and by JST PRESTO Grant Number JPMJPR1922.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yuto Nakashima.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Inoue, H., Matsuoka, Y., Nakashima, Y. et al. Factorizing Strings into Repetitions. Theory Comput Syst 66, 484–501 (2022). https://doi.org/10.1007/s00224-022-10070-3

Download citation

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00224-022-10070-3

Keywords

Navigation