Abstract
The exact pattern matching problem is to find all occurrences of a pattern p in a text t. We say that a pattern matching algorithm is optimal if its running time is linear in the sizes of t and p, i.e. O(|t| + |p|). Perhaps one of the most interesting settings of the pattern matching problem is when one has to design an efficient algorithm with the help of only a small amount of extra space. In this paper we explore this setting to the extreme. We make the additional assumption that the text t is available only in a compressed form, represented by a straight-line program. Compression methods based on the efficient construction of straight-line programs are competitive with the compression standards, including the Lempel-Ziv compression scheme and the recently intensively studied compression via block sorting due to Burrows and Wheeler. Our main result is a solution to the compressed string matching problem in optimal linear time when only a constant amount of extra space is available. We also discuss an efficient implementation of a version of our algorithm, showing that the new concept may also have interesting real applications.
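To make the setting concrete, the following minimal sketch shows what a straight-line program is and what the matching problem asks for. It is not the algorithm of the paper, which the abstract does not describe. The rule names (`X1` to `X6`), the helper names (`expand`, `failure_table`, `find_all`) and the Fibonacci-word example are assumptions made for illustration. The sketch streams the text out of the grammar and runs the classical Knuth-Morris-Pratt matcher over it, so it uses O(|p|) space for the failure table plus a stack proportional to the grammar depth, not constant extra space.

```python
from typing import Dict, Iterator, List, Tuple, Union

# A rule is either one terminal character or a pair of rule names (X -> Y Z).
Rule = Union[str, Tuple[str, str]]

# Hypothetical example: a straight-line program deriving the Fibonacci word
# "abaababa". Each rule is used to build the next, so the grammar grows
# linearly while the derived text grows exponentially.
FIB: Dict[str, Rule] = {
    "X1": "b",
    "X2": "a",
    "X3": ("X2", "X1"),  # ab
    "X4": ("X3", "X2"),  # aba
    "X5": ("X4", "X3"),  # abaab
    "X6": ("X5", "X4"),  # abaababa
}


def expand(slp: Dict[str, Rule], start: str) -> Iterator[str]:
    """Yield the characters derived from `start`, left to right."""
    stack = [start]  # explicit stack: space proportional to grammar depth
    while stack:
        rule = slp[stack.pop()]
        if isinstance(rule, str):
            yield rule
        else:
            left, right = rule
            stack.append(right)  # pushed first so that `left` is expanded first
            stack.append(left)


def failure_table(p: str) -> List[int]:
    """Knuth-Morris-Pratt failure function: longest proper border per prefix."""
    fail = [0] * len(p)
    k = 0
    for i in range(1, len(p)):
        while k and p[i] != p[k]:
            k = fail[k - 1]
        if p[i] == p[k]:
            k += 1
        fail[i] = k
    return fail


def find_all(slp: Dict[str, Rule], start: str, p: str) -> List[int]:
    """Return the start positions of all occurrences of p in the derived text."""
    if not p:
        return []
    fail = failure_table(p)
    hits: List[int] = []
    k = 0  # number of pattern characters currently matched
    for i, c in enumerate(expand(slp, start)):
        while k and c != p[k]:
            k = fail[k - 1]
        if c == p[k]:
            k += 1
        if k == len(p):
            hits.append(i - len(p) + 1)
            k = fail[k - 1]  # keep going to report overlapping occurrences
    return hits


print("".join(expand(FIB, "X6")))  # abaababa
print(find_all(FIB, "X6", "aba"))  # [0, 3, 5]
```

The running time of this sketch is linear in the length of the uncompressed text plus the pattern. The text itself is never stored in full, but the auxiliary space still depends on the pattern and on the grammar.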
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gasieniec, L., Potapov, I. (2001). Time/Space Efficient Compressed Pattern Matching. In: Freivalds, R. (eds) Fundamentals of Computation Theory. FCT 2001. Lecture Notes in Computer Science, vol 2138. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44669-9_15
DOI: https://doi.org/10.1007/3-540-44669-9_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42487-1
Online ISBN: 978-3-540-44669-9
eBook Packages: Springer Book Archive