Abstract
A substring u of a string T is called a minimal unique substring (MUS) of T if u occurs exactly once in T and any proper substring of u occurs at least twice in T. A string w is called a minimal absent word (MAW) of T if w does not occur in T and any proper substring of w occurs in T. In this paper, we study the problems of computing MUSs and MAWs in a sliding window over a given string T. We first show how the set of MUSs can change in a sliding window over T, and present an \(O(n\log \sigma )\)-time and O(d)-space algorithm to compute MUSs in a sliding window of width d over T, where \(\sigma \) is the maximum number of distinct characters in every window. We then give tight upper and lower bounds on the maximum number of changes in the set of MAWs in a sliding window over T. Our bounds improve on the previous results in Crochemore et al. (2017).
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
At least one of w[2..] and \(w[..|w|\, - \,1]\) is absent from T[i..j], because \(w \not \in \mathsf {MAW}\) (T[i..j]).
References
Chairungsee, S., Crochemore, M.: Using minimal absent words to build phylogeny. Theor. Comput. Sci. 450, 109–116 (2012)
Cleary, J.G., Witten, I.H.: Data compression using adaptive coding and partial string matching. IEEE Trans. Commun. 32(4), 396–402 (1984)
Crochemore, M., Mignosi, F., Restivo, A., Salemi, S.: Data compression using antidictionaries. Proc. IEEE 88(11), 1756–1768 (2000)
Crochemore, M., Héliou, A., Kucherov, G., Mouchard, L., Pissis, S.P., Ramusat, Y.: Minimal absent words in a sliding window and applications to on-line pattern matching. In: Klasing, R., Zeitoun, M. (eds.) FCT 2017. LNCS, vol. 10472, pp. 164–176. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-55751-8_14
Gräf, S.: Optimized design and assessment of whole genome tiling arrays. Bioinformatics 23(13), i195–i204 (2007)
Hu, X., Pei, J., Tao, Y.: Shortest unique queries on strings. In: Moura, E., Crochemore, M. (eds.) SPIRE 2014. LNCS, vol. 8799, pp. 161–172. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11918-2_16
Larsson, N.J.: Extended application of suffix trees to data compression. In: DCC 1996, pp. 190–199 (1996)
Mieno, T., Inenaga, S., Bannai, H., Takeda, M.: Shortest unique substring queries on run-length encoded strings. In: MFCS 2016, pp. 69:1–69:11 (2016)
Mieno, T., et al.: Minimal unique substrings and minimal absent words in a sliding window. CoRR abs/1909.02804 (2019)
Ota, T., Fukae, H., Morita, H.: Dynamic construction of an antidictionary with linear complexity. Theor. Comput. Sci. 526, 108–119 (2014)
Pei, J., Wu, W.C., Yeh, M.: On shortest unique substring queries. In: ICDE 2013, pp. 937–948 (2013)
Senft, M.: Suffix tree for a sliding window: an overview. In: WDS, vol. 5, pp. 41–46 (2005)
Silva, R.M., Pratas, D., Castro, L., Pinho, A.J., Ferreira, P.J.S.G.: Three minimal sequences found in Ebola virus genomes and absent from human DNA. Bioinformatics 31(15), 2421–2425 (2015)
Tsuruta, K., Inenaga, S., Bannai, H., Takeda, M.: Shortest unique substrings queries in optimal time. In: Geffert, V., Preneel, B., Rovan, B., Štuller, J., Tjoa, A.M. (eds.) SOFSEM 2014. LNCS, vol. 8327, pp. 503–513. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-04298-5_44
Ukkonen, E.: On-line construction of suffix trees. Algorithmica 14(3), 249–260 (1995)
Ziv, J., Lempel, A.: A universal algorithm for sequential data compression. IEEE Trans. Inf. Theory IT–23(3), 337–349 (1977)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Mieno, T. et al. (2020). Minimal Unique Substrings and Minimal Absent Words in a Sliding Window. In: Chatzigeorgiou, A., et al. SOFSEM 2020: Theory and Practice of Computer Science. SOFSEM 2020. Lecture Notes in Computer Science(), vol 12011. Springer, Cham. https://doi.org/10.1007/978-3-030-38919-2_13
Download citation
DOI: https://doi.org/10.1007/978-3-030-38919-2_13
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-38918-5
Online ISBN: 978-3-030-38919-2
eBook Packages: Computer ScienceComputer Science (R0)