Beating \(\mathcal{O}(nm)\) in Approximate LZW-Compressed Pattern Matching

  • Conference paper
Algorithms and Computation (ISAAC 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8283)

Abstract

Given an LZW/LZ78-compressed text, we want to find an approximate occurrence of a given pattern of length m. The goal is to achieve a running time that depends on the size n of the compressed representation of the text rather than on the length of the text itself. We consider two specific definitions of approximate matching, namely the Hamming distance and the edit distance, and show how to achieve \(\mathcal{O}(n\sqrt{m}k^{2})\) and \(\mathcal{O}(n\sqrt{m}k^{3})\) running time, respectively, where k is the bound on the distance, both in linear space. Even for very small values of k, the best previously known solutions required \(\Omega(nm)\) time. Our main contribution is applying a periodicity-based argument in a way that is computationally effective even when operating on a compressed representation of a string, whereas the previous solutions were based either on dynamic programming or on a black-box application of tools developed for uncompressed strings.
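To make the quantities in the abstract concrete, the following minimal sketch builds an LZW parse of a text (its number of codewords plays the role of n) and then reports k-mismatch occurrences of a pattern under the Hamming distance. It is not the algorithm of the paper: it is the naive baseline that first decompresses and then compares every alignment, so its running time grows with the uncompressed length times m. The function names and the choice of initial alphabet (the sorted set of characters in the text) are assumptions made for illustration only.

```python
def lzw_compress(text):
    """Return (codes, alphabet) for a textbook LZW parse of `text`.

    Assumption: the initial dictionary holds the distinct characters of
    `text` in sorted order; real encoders usually start from all bytes.
    """
    alphabet = sorted(set(text))
    dictionary = {ch: i for i, ch in enumerate(alphabet)}
    codes = []
    current = ""
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate  # keep extending the current phrase
        else:
            codes.append(dictionary[current])
            dictionary[candidate] = len(dictionary)  # new phrase
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes, alphabet


def lzw_decompress(codes, alphabet):
    """Invert lzw_compress by rebuilding the phrase dictionary on the fly."""
    if not codes:
        return ""
    entries = list(alphabet)
    previous = entries[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code < len(entries):
            entry = entries[code]
        else:
            # The code refers to the phrase being defined right now.
            entry = previous + previous[0]
        entries.append(previous + entry[0])
        out.append(entry)
        previous = entry
    return "".join(out)


def k_mismatch_occurrences(text, pattern, k):
    """Naive baseline: start positions where `pattern` matches `text`
    with Hamming distance at most k."""
    m = len(pattern)
    hits = []
    for i in range(len(text) - m + 1):
        mismatches = 0
        for j in range(m):
            if text[i + j] != pattern[j]:
                mismatches += 1
                if mismatches > k:
                    break
        if mismatches <= k:
            hits.append(i)
    return hits


if __name__ == "__main__":
    text = "abababababab"
    codes, alphabet = lzw_compress(text)
    print("n =", len(codes), "codewords for", len(text), "characters")
    recovered = lzw_decompress(codes, alphabet)
    print(k_mismatch_occurrences(recovered, "abbb", 1))
```

On highly repetitive inputs the number of codewords can be much smaller than the text length, which is why bounds stated in terms of n rather than the uncompressed length are of interest.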

Supported by NCN grant 2011/01/D/ST6/07164, 2011–2014.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gawrychowski, P., Straszak, D. (2013). Beating \(\mathcal{O}(nm)\) in Approximate LZW-Compressed Pattern Matching. In: Cai, L., Cheng, SW., Lam, TW. (eds) Algorithms and Computation. ISAAC 2013. Lecture Notes in Computer Science, vol 8283. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45030-3_8

  • DOI: https://doi.org/10.1007/978-3-642-45030-3_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-45029-7

  • Online ISBN: 978-3-642-45030-3

  • eBook Packages: Computer Science, Computer Science (R0)
