Abstract
An extended special factor of a word x is a factor of x whose longest infix can be extended by at least two distinct letters to the left or to the right and still occur in x. It is called extended bispecial if it can be extended in both directions and still occur in x. Let \(\rho(n)\) be the maximum number of extended bispecial factors over all words of length n. Almirantis et al. have shown that \(2n - 6 \le \rho(n) \le 3n - 4\) [WABI 2017]. In this article, we show that there is no constant \(c < 3\) such that \(\rho(n) \le cn\). We then exploit the connection between extended special factors and minimal absent words to construct a data structure for computing minimal absent words of a specific length in optimal time for integer alphabets, generalising a result by Fujishige et al. [MFCS 2016]. As an application of our data structure, we show how to compare two words over an integer alphabet in optimal time, improving on another result by Charalampopoulos et al. [Inf. Comput. 2018].
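To make the notion of a minimal absent word of a specific length concrete, the following is a naive sketch in Python. It relies on the standard definition, which the abstract does not spell out: a word is a minimal absent word of x if it does not occur in x although all of its proper factors do. The names `factors` and `minimal_absent_words` and the optional `alphabet` parameter are illustrative assumptions. The sketch is brute force and is not the data structure constructed in this article.

```python
def factors(x, length):
    """Return the set of all factors (substrings) of x with the given length."""
    return {x[i:i + length] for i in range(len(x) - length + 1)}


def minimal_absent_words(x, length, alphabet=None):
    """Return, in sorted order, the minimal absent words of x of exactly `length` letters.

    Assumption: `alphabet` defaults to the letters that occur in x.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if alphabet is None:
        alphabet = sorted(set(x))
    if length == 1:
        # A single letter is a minimal absent word when it never occurs in x.
        return sorted(a for a in alphabet if a not in x)

    present = factors(x, length)       # factors of the target length
    shorter = factors(x, length - 1)   # factors one letter shorter
    result = set()
    for left in shorter:               # left plays the role of "a u"
        for b in alphabet:
            w = left + b               # candidate "a u b"
            # w is minimal absent if it is absent while both its longest
            # proper prefix (left) and longest proper suffix (w[1:]) occur.
            if w not in present and w[1:] in shorter:
                result.add(w)
    return sorted(result)


if __name__ == "__main__":
    for k in range(1, 6):
        print(k, minimal_absent_words("abaab", k))
```

For x = abaab over the alphabet {a, b}, this sketch reports bb at length 2, aaa and bab at length 3, aaba at length 4, and nothing at lengths 1 and 5.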
P. Charalampopoulos—Partially supported by a Studentship from the Faculty of Natural and Mathematical Sciences at King’s College London and an A. G. Leventis Foundation Educational Grant.
Notes
1. In this case, x is called closed. Such words are an object of combinatorial interest [21].
References
Almirantis, Y., et al.: On avoided words, absent words, and their application to biological sequence analysis. Algorithms Mol. Biol. 12(1), 5:1–5:12 (2017)
Almirantis, Y., et al.: Optimal computation of overabundant words. In: Schwartz, R., Reinert, K. (eds.) 17th International Workshop on Algorithms in Bioinformatics (WABI 2017), Leibniz International Proceedings in Informatics (LIPIcs), Dagstuhl, Germany, vol. 88, pp. 4:1–4:14. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2017)
Barton, C., Héliou, A., Mouchard, L., Pissis, S.P.: Linear-time computation of minimal absent words using suffix array. BMC Bioinform. 15, 388 (2014)
Béal, M.-P., Mignosi, F., Restivo, A.: Minimal forbidden words and symbolic dynamics. In: Puech, C., Reischuk, R. (eds.) STACS 1996. LNCS, vol. 1046, pp. 555–566. Springer, Heidelberg (1996). https://doi.org/10.1007/3-540-60922-9_45
Belazzougui, D., Cunial, F.: A framework for space-efficient string kernels. Algorithmica 79(3), 857–883 (2017)
Belazzougui, D., Cunial, F., Kärkkäinen, J., Mäkinen, V.: Versatile succinct representations of the bidirectional Burrows-Wheeler transform. In: Bodlaender, H.L., Italiano, G.F. (eds.) ESA 2013. LNCS, vol. 8125, pp. 133–144. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40450-4_12
Blumer, A., Blumer, J., Haussler, D., Ehrenfeucht, A., Chen, M.T., Seiferas, J.I.: The smallest automaton recognizing the subwords of a text. Theor. Comput. Sci. 40, 31–55 (1985)
Carpi, A., de Luca, A.: Special factors, periodicity, and an application to Sturmian words. Acta Inf. 36(12), 983–1006 (2000)
Carpi, A., de Luca, A.: Words and special factors. Theor. Comput. Sci. 259(1–2), 145–182 (2001)
Cassaigne, J., Fici, G., Sciortino, M., Zamboni, L.Q.: Cyclic complexity of words. J. Comb. Theory Ser. A 145, 36–56 (2017)
Chairungsee, S., Crochemore, M.: Using minimal absent words to build phylogeny. Theor. Comput. Sci. 450, 109–116 (2012)
Charalampopoulos, P., Crochemore, M., Fici, G., Mercaş, R., Pissis, S.P.: Alignment-free sequence comparison using absent words. Inf. Comput. (2018, in Press)
Crochemore, M.: Transducers and repetitions. Theor. Comput. Sci. 45(1), 63–86 (1986)
Crochemore, M., Hancart, C., Lecroq, T.: Algorithms on Strings. Cambridge University Press, New York (2007)
Crochemore, M., Héliou, A., Kucherov, G., Mouchard, L., Pissis, S.P., Ramusat, Y.: Minimal absent words in a sliding window and applications to on-line pattern matching. In: Klasing, R., Zeitoun, M. (eds.) FCT 2017. LNCS, vol. 10472, pp. 164–176. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-55751-8_14
Crochemore, M., Mignosi, F., Restivo, A.: Automata and forbidden words. Inf. Process. Lett. 67(3), 111–117 (1998)
Crochemore, M., Navarro, G.: Improved antidictionary based compression. In: 22nd International Conference of the Chilean Computer Science Society (SCCC 2002), 6–8 November 2002, Copiapo, Chile, pp. 7–13. IEEE Computer Society (2002)
de Luca, A., Mione, L.: On bispecial factors of the Thue-Morse word. Inf. Process. Lett. 49(4), 179–183 (1994)
de Luca, A., Varricchio, S.: On the factors of the Thue-Morse word on three symbols. Inf. Process. Lett. 27(6), 281–285 (1988)
de Luca, A., Varricchio, S.: Some combinatorial properties of the Thue-Morse sequence and a problem in semigroups. Theor. Comput. Sci. 63(3), 333–348 (1989)
Fici, G.: Open and closed words. Bull. EATCS 123 (2017). http://eatcs.org/beatcs/index.php/beatcs/article/view/508
Fici, G., Mignosi, F., Restivo, A., Sciortino, M.: Word assembly through minimal forbidden words. Theor. Comput. Sci. 359(1–3), 214–230 (2006)
Fujishige, Y., Tsujimaru, Y., Inenaga, S., Bannai, H., Takeda, M.: Computing DAWGs and minimal absent words in linear time for integer alphabets. In: Faliszewski, P., Muscholl, A., Niedermeier, R. (eds.) 41st International Symposium on Mathematical Foundations of Computer Science. MFCS 2016, LIPIcs, 22–26 August 2016, Kraków, Poland, vol. 58, pp. 38:1–38:14. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2016)
Silva, R.M., Pratas, D., Castro, L., Pinho, A.J., Ferreira, P.J.S.G.: Three minimal sequences found in ebola virus genomes and absent from human DNA. Bioinformatics 31(15), 2421–2425 (2015)
Acknowledgements
The authors would like to acknowledge the financial support towards travel and subsistence from the Laboratoire d’Informatique Gaspard-Monge at the Université Paris-Est, where part of this work has been conducted.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Charalampopoulos, P., Crochemore, M., Pissis, S.P. (2018). On Extended Special Factors of a Word. In: Gagie, T., Moffat, A., Navarro, G., Cuadros-Vargas, E. (eds) String Processing and Information Retrieval. SPIRE 2018. Lecture Notes in Computer Science, vol 11147. Springer, Cham. https://doi.org/10.1007/978-3-030-00479-8_11
DOI: https://doi.org/10.1007/978-3-030-00479-8_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00478-1
Online ISBN: 978-3-030-00479-8
eBook Packages: Computer Science, Computer Science (R0)