Pattern matching in compressed texts

Preliminary version

  • Algorithms
  • Conference paper
Foundations of Software Technology and Theoretical Computer Science (FSTTCS 1995)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1026)

Abstract

We consider the problem of pattern matching when the text is in compressed form. As in Amir, Benson and Farach, we assume that the text is compressed by the Lempel-Ziv-Welch scheme. If the compressed text is of length $n$ and the pattern is of length $m$, our basic compression algorithm runs in $O(n + m\sqrt{m}\log m)$ steps, as against Amir et al.'s bound of $O(n + m^2)$ steps. We extend the basic algorithm into another that achieves, for any $k \geq 1$, $O(nk + m^{1+1/k}\log m)$ steps.
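To make the setting concrete, the following is a minimal Python sketch of Lempel-Ziv-Welch coding together with the naive baseline of decompressing first and then searching. It is not the algorithm of this paper or of Amir et al.; the function names (`lzw_compress`, `lzw_decompress`, `naive_compressed_search`) and the explicit `alphabet` argument are assumptions made for illustration. The length of the code list plays the role of $n$ in the abstract, and the point of compressed matching is to avoid the full decompression step shown here, whose cost grows with the uncompressed length.

```python
def lzw_compress(text, alphabet):
    """Encode text as a list of LZW codes (assumes every symbol is in alphabet)."""
    # Initial dictionary: one code per alphabet symbol.
    table = {ch: i for i, ch in enumerate(alphabet)}
    codes = []
    current = ""
    for ch in text:
        candidate = current + ch
        if candidate in table:
            current = candidate          # keep extending the current phrase
        else:
            codes.append(table[current])  # emit the longest known phrase
            table[candidate] = len(table)  # learn the new phrase
            current = ch
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes, alphabet):
    """Rebuild the text from LZW codes, regrowing the dictionary on the fly."""
    entries = list(alphabet)  # entries[i] is the phrase for code i
    if not codes:
        return ""
    previous = entries[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code < len(entries):
            entry = entries[code]
        else:
            # The code refers to the phrase that is being defined right now.
            entry = previous + previous[0]
        entries.append(previous + entry[0])
        out.append(entry)
        previous = entry
    return "".join(out)


def naive_compressed_search(codes, pattern, alphabet):
    """Baseline only: decompress fully, then report all match positions."""
    text = lzw_decompress(codes, alphabet)
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)  # allow overlapping matches
    return positions
```

For example, `lzw_compress("abababab", "ab")` yields five codes for an eight-symbol text, and the baseline search for `"aba"` over those codes reports the overlapping occurrences in the original text.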

Supported by NSF Grants CCR9107293 and CCR9508545

References

  1. A.V. Aho, J.E. Hopcroft, and J.D. Ullman. The design and analysis of computer algorithms. Addison-Wesley Publishing Co., Reading, Mass., 1974.

  2. A. Amir, G. Benson, and M. Farach. Let sleeping files lie: Pattern matching in Z-compressed files. Proc. of 5th Annual ACM-SIAM Symp. on Discrete Algorithms, pages 705–714, 1994.

  3. T. Eilam-Tsoreff and U. Vishkin. Matching patterns in a string subject to multilinear transformations. Proc. of International Workshop on Sequences, Combinatorics, Compression, Security and Transmission, Salerno, Italy, June 1988.

  4. M. Farach and M. Thorup. Pattern matching in Lempel-Ziv compressed strings. Proc. of 27th Annual ACM Symp. on Theory of Computing, pages 703–712, 1995.

  5. J. JaJa. An introduction to parallel algorithms. Addison-Wesley Publishing Co., Reading, Mass., 1992.

  6. D. E. Knuth, J. H. Morris, and V. R. Pratt. Fast pattern matching in strings. SIAM J. on Computing, pages 323–350, 1977.

  7. E. M. McCreight. A space-economical suffix tree construction algorithm. J. of the ACM, pages 262–272, 1976.

  8. B. Schieber and U. Vishkin. On finding lowest common ancestors: Simplification and parallelization. SIAM J. on Computing, pages 1253–1262, 1988.

  9. R. Sundar. Twists, turns, cascades, deque conjecture, and scanning theorem. Proc. of 30th Annual IEEE Symp. on Foundations of Computer Science, pages 555–559, 1989.

  10. R. Tarjan. Efficiency of a good but not linear set union algorithm. J. of the ACM, pages 215–225, 1975.

  11. R. Tarjan. Sequential access in splay trees takes linear time. Combinatorica, pages 367–378, 1985.

  12. P. Weiner. Linear pattern matching algorithm. Proc. of 14th Annual IEEE Symp. on Switching and Automata Theory, pages 1–11, 1973.

  13. T. A. Welch. A technique for high-performance data compression. IEEE Computer, pages 8–19, 1984.

  14. J. Ziv and A. Lempel. A universal algorithm for sequential data compression. IEEE Trans. on Information Theory, pages 337–343, 1977.

Editor information

P. S. Thiagarajan

Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kosaraju, S.R. (1995). Pattern matching in compressed texts. In: Thiagarajan, P.S. (eds) Foundations of Software Technology and Theoretical Computer Science. FSTTCS 1995. Lecture Notes in Computer Science, vol 1026. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60692-0_60

  • DOI: https://doi.org/10.1007/3-540-60692-0_60

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-60692-5

  • Online ISBN: 978-3-540-49263-4

  • eBook Packages: Springer Book Archive
