Skip to main content

Minimal Unique Substrings and Minimal Absent Words in a Sliding Window

  • Conference paper
  • First Online:
SOFSEM 2020: Theory and Practice of Computer Science (SOFSEM 2020)

Abstract

A substring u of a string T is called a minimal unique substring (MUS) of T if u occurs exactly once in T and any proper substring of u occurs at least twice in T. A string w is called a minimal absent word (MAW) of T if w does not occur in T and any proper substring of w occurs in T. In this paper, we study the problems of computing MUSs and MAWs in a sliding window over a given string T. We first show how the set of MUSs can change in a sliding window over T, and present an \(O(n\log \sigma )\)-time and O(d)-space algorithm to compute MUSs in a sliding window of width d over T, where \(\sigma \) is the maximum number of distinct characters in every window. We then give tight upper and lower bounds on the maximum number of changes in the set of MAWs in a sliding window over T. Our bounds improve on the previous results in Crochemore et al. (2017).

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    At least one of w[2..] and \(w[..|w|\, - \,1]\) is absent from T[i..j], because \(w \not \in \mathsf {MAW}\) (T[i..j]).

References

  1. Chairungsee, S., Crochemore, M.: Using minimal absent words to build phylogeny. Theor. Comput. Sci. 450, 109–116 (2012)

    Article  MathSciNet  Google Scholar 

  2. Cleary, J.G., Witten, I.H.: Data compression using adaptive coding and partial string matching. IEEE Trans. Commun. 32(4), 396–402 (1984)

    Article  Google Scholar 

  3. Crochemore, M., Mignosi, F., Restivo, A., Salemi, S.: Data compression using antidictionaries. Proc. IEEE 88(11), 1756–1768 (2000)

    Article  Google Scholar 

  4. Crochemore, M., Héliou, A., Kucherov, G., Mouchard, L., Pissis, S.P., Ramusat, Y.: Minimal absent words in a sliding window and applications to on-line pattern matching. In: Klasing, R., Zeitoun, M. (eds.) FCT 2017. LNCS, vol. 10472, pp. 164–176. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-55751-8_14

    Chapter  Google Scholar 

  5. Gräf, S.: Optimized design and assessment of whole genome tiling arrays. Bioinformatics 23(13), i195–i204 (2007)

    Article  Google Scholar 

  6. Hu, X., Pei, J., Tao, Y.: Shortest unique queries on strings. In: Moura, E., Crochemore, M. (eds.) SPIRE 2014. LNCS, vol. 8799, pp. 161–172. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11918-2_16

    Chapter  Google Scholar 

  7. Larsson, N.J.: Extended application of suffix trees to data compression. In: DCC 1996, pp. 190–199 (1996)

    Google Scholar 

  8. Mieno, T., Inenaga, S., Bannai, H., Takeda, M.: Shortest unique substring queries on run-length encoded strings. In: MFCS 2016, pp. 69:1–69:11 (2016)

    Google Scholar 

  9. Mieno, T., et al.: Minimal unique substrings and minimal absent words in a sliding window. CoRR abs/1909.02804 (2019)

    Google Scholar 

  10. Ota, T., Fukae, H., Morita, H.: Dynamic construction of an antidictionary with linear complexity. Theor. Comput. Sci. 526, 108–119 (2014)

    Article  MathSciNet  Google Scholar 

  11. Pei, J., Wu, W.C., Yeh, M.: On shortest unique substring queries. In: ICDE 2013, pp. 937–948 (2013)

    Google Scholar 

  12. Senft, M.: Suffix tree for a sliding window: an overview. In: WDS, vol. 5, pp. 41–46 (2005)

    Google Scholar 

  13. Silva, R.M., Pratas, D., Castro, L., Pinho, A.J., Ferreira, P.J.S.G.: Three minimal sequences found in Ebola virus genomes and absent from human DNA. Bioinformatics 31(15), 2421–2425 (2015)

    Article  Google Scholar 

  14. Tsuruta, K., Inenaga, S., Bannai, H., Takeda, M.: Shortest unique substrings queries in optimal time. In: Geffert, V., Preneel, B., Rovan, B., Štuller, J., Tjoa, A.M. (eds.) SOFSEM 2014. LNCS, vol. 8327, pp. 503–513. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-04298-5_44

    Chapter  Google Scholar 

  15. Ukkonen, E.: On-line construction of suffix trees. Algorithmica 14(3), 249–260 (1995)

    Article  MathSciNet  Google Scholar 

  16. Ziv, J., Lempel, A.: A universal algorithm for sequential data compression. IEEE Trans. Inf. Theory IT–23(3), 337–349 (1977)

    Article  MathSciNet  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Takuya Mieno .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Mieno, T. et al. (2020). Minimal Unique Substrings and Minimal Absent Words in a Sliding Window. In: Chatzigeorgiou, A., et al. SOFSEM 2020: Theory and Practice of Computer Science. SOFSEM 2020. Lecture Notes in Computer Science(), vol 12011. Springer, Cham. https://doi.org/10.1007/978-3-030-38919-2_13

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-38919-2_13

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-38918-5

  • Online ISBN: 978-3-030-38919-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics