Time/Space Efficient Compressed Pattern Matching

  • Conference paper
Fundamentals of Computation Theory (FCT 2001)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2138)

Abstract

The exact pattern matching problem is to find all occurrences of a pattern p in a text t. A pattern matching algorithm is called optimal if its running time is linear in the sizes of t and p, i.e. O(|t| + |p|). One of the most interesting settings of the problem arises when the algorithm must remain efficient while using only a small amount of extra space. In this paper we explore this setting to the extreme. We additionally assume that the text t is available only in compressed form, represented by a straight-line program. Compression methods based on efficient construction of straight-line programs are competitive with standard compression schemes, including Lempel-Ziv compression and the recently intensively studied compression via block sorting due to Burrows and Wheeler. Our main result is a solution to the compressed string matching problem that runs in optimal linear time using only constant extra space. We also discuss an efficient implementation of a version of our algorithm, showing that the new concept may also have interesting practical applications.
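To make the setting concrete, the following is a minimal sketch of a text stored as a straight-line program and searched without first materialising it as a string. It is not the algorithm of the paper: it streams the expansion through the Knuth-Morris-Pratt automaton, so it uses O(|p|) space for the failure table plus a stack proportional to the depth of the grammar, rather than constant extra space. The rule encoding and the names `expand`, `failure_table` and `find_occurrences` are assumptions made for this illustration.

```python
from typing import Iterator, List, Tuple, Union

# Hypothetical encoding: a rule is either a single terminal character,
# or a pair (i, j) meaning "rule i followed by rule j".
Rule = Union[str, Tuple[int, int]]


def expand(rules: List[Rule], root: int) -> Iterator[str]:
    """Yield the characters derived from rules[root], left to right."""
    stack = [root]
    while stack:
        rule = rules[stack.pop()]
        if isinstance(rule, str):
            yield rule
        else:
            left, right = rule
            stack.append(right)  # pushed first so the left part is expanded first
            stack.append(left)


def failure_table(pattern: str) -> List[int]:
    """Standard KMP failure function: longest proper border of each prefix."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def find_occurrences(rules: List[Rule], root: int, pattern: str) -> List[int]:
    """Return start positions of pattern in the text derived from rules[root]."""
    if not pattern:
        return []
    fail = failure_table(pattern)
    hits: List[int] = []
    k = 0  # number of pattern characters currently matched
    for pos, ch in enumerate(expand(rules, root)):
        while k and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(pos - k + 1)
            k = fail[k - 1]  # allow overlapping occurrences
    return hits


# Fibonacci-word grammar: each rule concatenates the two preceding ones.
rules: List[Rule] = ["b", "a", (1, 0), (2, 1), (3, 2), (4, 3)]
print("".join(expand(rules, 5)))          # abaababa
print(find_occurrences(rules, 5, "aba"))  # [0, 3, 5]
```

In this example six rules describe a text of length eight; for longer Fibonacci words the derived text grows exponentially in the number of rules, which is why working directly on the compressed representation is attractive.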

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gasieniec, L., Potapov, I. (2001). Time/Space Efficient Compressed Pattern Matching. In: Freivalds, R. (eds) Fundamentals of Computation Theory. FCT 2001. Lecture Notes in Computer Science, vol 2138. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44669-9_15

  • DOI: https://doi.org/10.1007/3-540-44669-9_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42487-1

  • Online ISBN: 978-3-540-44669-9

  • eBook Packages: Springer Book Archive
