On Extended Special Factors of a Word

  • Conference paper
String Processing and Information Retrieval (SPIRE 2018)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11147)

Abstract

An extended special factor of a word x is a factor of x whose longest infix can be extended by at least two distinct letters to the left or to the right and still occur in x. It is called extended bispecial if it can be extended in both directions and still occur in x. Let \(\rho (n)\) be the maximum number of extended bispecial factors over all words of length n. Almirantis et al. have shown that \(2n - 6 \le \rho (n) \le 3n - 4\) [WABI 2017]. In this article, we show that there is no constant \(c<3\) such that \(\rho (n) \le cn\). We then exploit the connection between extended special factors and minimal absent words to construct a data structure for computing minimal absent words of a specific length in optimal time for integer alphabets, generalising a result by Fujishige et al. [MFCS 2016]. As an application of our data structure, we show how to compare two words over an integer alphabet in optimal time, improving on another result by Charalampopoulos et al. [Inf. Comput. 2018].
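
To make the notion of a minimal absent word concrete, the following is a minimal brute-force sketch in Python. It is not the optimal-time data structure of the paper; it only applies the standard definition, under which a word w is a minimal absent word of x if w does not occur in x while every proper factor of w does (equivalently, both w without its first letter and w without its last letter occur in x). The function names and the explicit `alphabet` parameter are assumptions made for illustration.

```python
def factors(x):
    """Return the set of all factors (substrings) of x, including the empty word."""
    return {x[i:j] for i in range(len(x) + 1) for j in range(i, len(x) + 1)}


def minimal_absent_words(x, alphabet, length):
    """Brute-force enumeration of the minimal absent words of x of a given length.

    A candidate w = left + b qualifies when w is absent from x,
    left = w[:-1] occurs in x, and w[1:] occurs in x.
    Illustrative only: this takes polynomial, not optimal, time.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    occ = factors(x)
    maws = set()
    # Only factors of length `length - 1` can be the prefix w[:-1] of a candidate.
    for left in (f for f in occ if len(f) == length - 1):
        for b in alphabet:
            w = left + b
            if w not in occ and w[1:] in occ:
                maws.add(w)
    return sorted(maws)


if __name__ == "__main__":
    for k in range(1, 6):
        print(k, minimal_absent_words("abaab", "ab", k))
```

For x = abaab over the alphabet {a, b}, this sketch reports bb at length 2, aaa and bab at length 3, aaba at length 4, and nothing at lengths 1 and 5.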

P. Charalampopoulos—Partially supported by a Studentship from the Faculty of Natural and Mathematical Sciences at King’s College London and an A. G. Leventis Foundation Educational Grant.

Notes

  1. In this case, x is called closed. Such words are an object of combinatorial interest [21].

References

  1. Almirantis, Y., et al.: On avoided words, absent words, and their application to biological sequence analysis. Algorithms Mol. Biol. 12(1), 5:1–5:12 (2017)

  2. Almirantis, Y., et al.: Optimal computation of overabundant words. In: Schwartz, R., Reinert, K. (eds.) 17th International Workshop on Algorithms in Bioinformatics (WABI 2017), Leibniz International Proceedings in Informatics (LIPIcs), Dagstuhl, Germany, vol. 88, pp. 4:1–4:14. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2017)

  3. Barton, C., Héliou, A., Mouchard, L., Pissis, S.P.: Linear-time computation of minimal absent words using suffix array. BMC Bioinform. 15, 388 (2014)

  4. Béal, M.-P., Mignosi, F., Restivo, A.: Minimal forbidden words and symbolic dynamics. In: Puech, C., Reischuk, R. (eds.) STACS 1996. LNCS, vol. 1046, pp. 555–566. Springer, Heidelberg (1996). https://doi.org/10.1007/3-540-60922-9_45

  5. Belazzougui, D., Cunial, F.: A framework for space-efficient string kernels. Algorithmica 79(3), 857–883 (2017)

  6. Belazzougui, D., Cunial, F., Kärkkäinen, J., Mäkinen, V.: Versatile succinct representations of the bidirectional Burrows-Wheeler transform. In: Bodlaender, H.L., Italiano, G.F. (eds.) ESA 2013. LNCS, vol. 8125, pp. 133–144. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40450-4_12

  7. Blumer, A., Blumer, J., Haussler, D., Ehrenfeucht, A., Chen, M.T., Seiferas, J.I.: The smallest automaton recognizing the subwords of a text. Theor. Comput. Sci. 40, 31–55 (1985)

  8. Carpi, A., de Luca, A.: Special factors, periodicity, and an application to Sturmian words. Acta Inf. 36(12), 983–1006 (2000)

  9. Carpi, A., de Luca, A.: Words and special factors. Theor. Comput. Sci. 259(1–2), 145–182 (2001)

  10. Cassaigne, J., Fici, G., Sciortino, M., Zamboni, L.Q.: Cyclic complexity of words. J. Comb. Theory Ser. A 145, 36–56 (2017)

  11. Chairungsee, S., Crochemore, M.: Using minimal absent words to build phylogeny. Theor. Comput. Sci. 450, 109–116 (2012)

  12. Charalampopoulos, P., Crochemore, M., Fici, G., Mercaş, R., Pissis, S.P.: Alignment-free sequence comparison using absent words. Inf. Comput. (2018, in Press)

  13. Crochemore, M.: Transducers and repetitions. Theor. Comput. Sci. 45(1), 63–86 (1986)

  14. Crochemore, M., Hancart, C., Lecroq, T.: Algorithms on Strings. Cambridge University Press, New York (2007)

  15. Crochemore, M., Héliou, A., Kucherov, G., Mouchard, L., Pissis, S.P., Ramusat, Y.: Minimal absent words in a sliding window and applications to on-line pattern matching. In: Klasing, R., Zeitoun, M. (eds.) FCT 2017. LNCS, vol. 10472, pp. 164–176. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-55751-8_14

  16. Crochemore, M., Mignosi, F., Restivo, A.: Automata and forbidden words. Inf. Process. Lett. 67(3), 111–117 (1998)

  17. Crochemore, M., Navarro, G.: Improved antidictionary based compression. In: 22nd International Conference of the Chilean Computer Science Society (SCCC 2002), 6–8 November 2002, Copiapo, Chile, pp. 7–13. IEEE Computer Society (2002)

  18. de Luca, A., Mione, L.: On bispecial factors of the Thue-Morse word. Inf. Process. Lett. 49(4), 179–183 (1994)

  19. de Luca, A., Varricchio, S.: On the factors of the Thue-Morse word on three symbols. Inf. Process. Lett. 27(6), 281–285 (1988)

  20. de Luca, A., Varricchio, S.: Some combinatorial properties of the Thue-Morse sequence and a problem in semigroups. Theor. Comput. Sci. 63(3), 333–348 (1989)

  21. Fici, G.: Open and closed words. Bull. EATCS. 123 (2017). http://eatcs.org/beatcs/index.php/beatcs/article/view/508

  22. Fici, G., Mignosi, F., Restivo, A., Sciortino, M.: Word assembly through minimal forbidden words. Theor. Comput. Sci. 359(1–3), 214–230 (2006)

  23. Fujishige, Y., Tsujimaru, Y., Inenaga, S., Bannai, H., Takeda, M.: Computing DAWGs and minimal absent words in linear time for integer alphabets. In: Faliszewski, P., Muscholl, A., Niedermeier, R. (eds.) 41st International Symposium on Mathematical Foundations of Computer Science. MFCS 2016, LIPIcs, 22–26 August 2016, Kraków, Poland, vol. 58, pp. 38:1–38:14. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2016)

  24. Silva, R.M., Pratas, D., Castro, L., Pinho, A.J., Ferreira, P.J.S.G.: Three minimal sequences found in ebola virus genomes and absent from human DNA. Bioinformatics 31(15), 2421–2425 (2015)


Acknowledgements

The authors would like to acknowledge the financial support towards travel and subsistence from the Laboratoire d’Informatique Gaspard-Monge at the Université Paris-Est, where part of this work has been conducted.

Author information

Corresponding author

Correspondence to Solon P. Pissis.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Charalampopoulos, P., Crochemore, M., Pissis, S.P. (2018). On Extended Special Factors of a Word. In: Gagie, T., Moffat, A., Navarro, G., Cuadros-Vargas, E. (eds) String Processing and Information Retrieval. SPIRE 2018. Lecture Notes in Computer Science, vol 11147. Springer, Cham. https://doi.org/10.1007/978-3-030-00479-8_11

  • DOI: https://doi.org/10.1007/978-3-030-00479-8_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00478-1

  • Online ISBN: 978-3-030-00479-8

  • eBook Packages: Computer Science (R0)
