String Attractors and Infinite Words

  • Conference paper
LATIN 2022: Theoretical Informatics (LATIN 2022)

Abstract

The notion of string attractor was introduced by Kempa and Prezza (STOC 2018) in the context of data compression. A string attractor is a set of positions of a finite word such that every factor of the word has at least one occurrence crossing one of those positions. The smallest size \(\gamma ^*\) of a string attractor for a finite word is a lower bound for several repetitiveness measures associated with the most common compression schemes, including BWT-based and LZ-based compressors. The combinatorial properties of the measure \(\gamma ^*\) have been studied in [Mantaci et al., TCS 2021]. Very recently, a complexity measure called the string attractor profile function has been introduced for infinite words; it evaluates \(\gamma ^*\) on each prefix. This measure has been studied for automatic sequences and linearly recurrent infinite words in [Schaeffer and Shallit, arXiv 2021]. In this paper, we study the relationship between this complexity measure and other well-known combinatorial notions related to repetitiveness in the context of infinite words, such as factor complexity and recurrence. Furthermore, we introduce new string attractor-based complexity measures that take into account the structure and the distribution of the positions in a string attractor of the prefixes of an infinite word. We show that these measures provide a finer classification of some infinite families of words.
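To make the two definitions in the abstract concrete, the following is a minimal brute-force sketch, not the authors' method. It assumes 0-based positions, and the function names (`factors`, `is_attractor`, `smallest_attractor_size`, `attractor_profile`) are illustrative and do not come from the paper. The search tries every candidate set of positions, so it runs in exponential time and is only meant for very short words.

```python
from itertools import combinations


def factors(word):
    """Return the set of all non-empty factors (substrings) of `word`."""
    n = len(word)
    return {word[i:j] for i in range(n) for j in range(i + 1, n + 1)}


def is_attractor(word, positions):
    """Check whether `positions` (0-based indices) is a string attractor of `word`."""
    n = len(word)
    attracted = set()
    for i in range(n):
        for j in range(i + 1, n + 1):
            # The occurrence word[i:j] spans indices i, ..., j - 1.
            if any(i <= p < j for p in positions):
                attracted.add(word[i:j])
    # Every factor needs at least one occurrence crossing a chosen position.
    return attracted == factors(word)


def smallest_attractor_size(word):
    """Return gamma* of `word` by trying candidate sets in order of increasing size."""
    n = len(word)
    for k in range(0, n + 1):
        for candidate in combinations(range(n), k):
            if is_attractor(word, candidate):
                return k


def attractor_profile(prefix):
    """Evaluate gamma* on every non-empty prefix of the given finite prefix."""
    return [smallest_attractor_size(prefix[:n]) for n in range(1, len(prefix) + 1)]


print(attractor_profile("abaab"))  # [1, 2, 2, 2, 2]
```

In this example the positions {1, 2} already attract every factor of "abaab", and no single position can, since each of the two letters needs its own position.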

References

  1. Allouche, J.P., Shallit, J.: Automatic Sequences: Theory, Applications, Generalizations. Cambridge University Press (2003)

  2. Béal, M., Perrin, D., Restivo, A.: Decidable problems in substitution shifts. CoRR abs/2112.14499 (2021)

  3. Cassaigne, J.: Sequences with grouped factors. In: Developments in Language Theory, pp. 211–222. Aristotle University of Thessaloniki (1997)

  4. Cassaigne, J., Karhumäki, J.: Toeplitz words, generalized periodicity and periodically iterated morphisms. Eur. J. Comb. 18(5), 497–510 (1997)

  5. Castiglione, G., Restivo, A., Sciortino, M.: Circular Sturmian words and Hopcroft’s algorithm. Theor. Comput. Sci. 410(43), 4372–4381 (2009)

  6. Castiglione, G., Restivo, A., Sciortino, M.: On extremal cases of Hopcroft’s algorithm. Theor. Comput. Sci. 411(38–39), 3414–3422 (2010)

  7. Castiglione, G., Restivo, A., Sciortino, M.: Hopcroft’s algorithm and cyclic automata. In: Martín-Vide, C., Otto, F., Fernau, H. (eds.) LATA 2008. LNCS, vol. 5196, pp. 172–183. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-88282-4_17

  8. Christiansen, A.R., Ettienne, M.B., Kociumaka, T., Navarro, G., Prezza, N.: Optimal-time dictionary-compressed indexes. ACM Trans. Algorithms 17(1), 8:1-8:39 (2021)

  9. Constantinescu, S., Ilie, L.: The Lempel-Ziv complexity of fixed points of morphisms. SIAM J. Discret. Math. 21(2), 466–481 (2007)

  10. Damanik, D., Lenz, D.: Substitution dynamical systems: characterization of linear repetitivity and applications. J. Math. Anal. Appl. 321(2), 766–780 (2006)

  11. Durand, F., Perrin, D.: Dimension Groups and Dynamical Systems: Substitutions, Bratteli Diagrams and Cantor Systems. Cambridge Studies in Advanced Mathematics, Cambridge University Press (2022)

  12. Frosini, A., Mancini, I., Rinaldi, S., Romana, G., Sciortino, M.: Logarithmic equal-letter runs for BWT of purely morphic words. In: Diekert, V., Volkov, M. (eds.) Developments in Language Theory. Lecture Notes in Computer Science, vol. 13257, pp. 139–151. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-05578-2_11

  13. Heinis, A.: Languages under substitutions and balanced words. Journal de Théorie des Nombres de Bordeaux 16, 151–172 (2004)

  14. Kempa, D., Prezza, N.: At the roots of dictionary compression: string attractors. In: STOC 2018, pp. 827–840. ACM (2018)

  15. Knuth, D., Morris, J., Pratt, V.: Fast pattern matching in strings. SIAM J. Comput. 6(2), 323–350 (1977)

  16. Kociumaka, T., Navarro, G., Prezza, N.: Towards a definitive measure of repetitiveness. In: Kohayakawa, Y., Miyazawa, F.K. (eds.) LATIN 2021. LNCS, vol. 12118, pp. 207–219. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-61792-9_17

  17. Kutsukake, K., Matsumoto, T., Nakashima, Y., Inenaga, S., Bannai, H., Takeda, M.: On repetitiveness measures of Thue-Morse words. In: Boucher, C., Thankachan, S.V. (eds.) SPIRE 2020. LNCS, vol. 12303, pp. 213–220. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59212-7_15

  18. Lempel, A., Ziv, J.: On the complexity of finite sequences. IEEE T. Inform. Theory 22(1), 75–81 (1976)

  19. Lothaire, M.: Algebraic Combinatorics on Words. Cambridge University Press, Cambridge (2002)

  20. Mantaci, S., Restivo, A., Sciortino, M.: Burrows-Wheeler transform and Sturmian words. Inform. Process. Lett. 86, 241–246 (2003)

  21. Mantaci, S., Restivo, A., Romana, G., Rosone, G., Sciortino, M.: A combinatorial view on string attractors. Theor. Comput. Sci. 850, 236–248 (2021)

  22. Navarro, G.: Indexing highly repetitive string collections, part I: repetitiveness measures. ACM Comput. Surv. 54(2), 29:1-29:31 (2021)

  23. Navarro, G.: Indexing highly repetitive string collections, part II: compressed indexes. ACM Comput. Surv. 54(2), 26:1-26:32 (2021)

  24. Navarro, G.: The compression power of the BWT: technical perspective. Commun. ACM 65(6), 90 (2022)

  25. Pansiot, J.-J.: Complexité des facteurs des mots infinis engendrés par morphismes itérés. In: Paredaens, J. (ed.) ICALP 1984. LNCS, vol. 172, pp. 380–389. Springer, Heidelberg (1984). https://doi.org/10.1007/3-540-13345-3_34

  26. Schaeffer, L., Shallit, J.: String attractors for automatic sequences. CoRR abs/2012.06840 (2021)

  27. Sciortino, M., Zamboni, L.Q.: Suffix automata and standard Sturmian words. In: Harju, T., Karhumäki, J., Lepistö, A. (eds.) DLT 2007. LNCS, vol. 4588, pp. 382–398. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-73208-2_36

Author information

Corresponding author

Correspondence to Marinella Sciortino.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Restivo, A., Romana, G., Sciortino, M. (2022). String Attractors and Infinite Words. In: Castañeda, A., Rodríguez-Henríquez, F. (eds) LATIN 2022: Theoretical Informatics. LATIN 2022. Lecture Notes in Computer Science, vol 13568. Springer, Cham. https://doi.org/10.1007/978-3-031-20624-5_26

  • DOI: https://doi.org/10.1007/978-3-031-20624-5_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20623-8

  • Online ISBN: 978-3-031-20624-5

  • eBook Packages: Computer Science, Computer Science (R0)