Abstract
We present an almost linear time (O(n · log |Σ|) time) O(log n)-ratio approximation of the minimal grammar-based compression of a given string of length n over an alphabet Σ, and an O(k · log n) time transformation of an LZ77 encoding of size k into a grammar-based encoding of size O(k · log n). Computing the exact size of the minimal grammar-based compression is known to be NP-complete. The basic novel tool is the AVL-grammar.
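The two objects related in the abstract, an LZ77 factorization and a grammar that generates exactly one string, can be illustrated with a minimal Python sketch. It is not the AVL-grammar construction of the paper: the factorization below is a naive greedy one, assumed here to be the variant without self-overlap, and the names `lz77_factorize`, `expand`, and `EXAMPLE_RULES` are hypothetical, introduced only for illustration.

```python
def lz77_factorize(s):
    """Naive greedy LZ77 factorization (assumed variant: no self-overlap).

    Each factor is either a single character that has not occurred before,
    or the longest prefix of the remaining text that occurs entirely inside
    the already processed part s[:i].
    """
    factors = []
    i = 0
    while i < len(s):
        length = 0
        # Extend the factor while it still occurs in the processed prefix.
        while i + length < len(s) and s[i:i + length + 1] in s[:i]:
            length += 1
        if length == 0:
            factors.append(s[i])      # new character
            i += 1
        else:
            factors.append(s[i:i + length])
            i += length
    return factors


def expand(rules, symbol):
    """Expand a grammar symbol into the unique string it generates.

    A rule is either a terminal (a str) or a pair (left, right) of symbols.
    """
    rhs = rules[symbol]
    if isinstance(rhs, str):
        return rhs
    left, right = rhs
    return expand(rules, left) + expand(rules, right)


# A grammar with 4 rules generating a string of length 8 (illustrative only).
EXAMPLE_RULES = {
    "X0": "a",
    "X1": ("X0", "X0"),
    "X2": ("X1", "X1"),
    "X3": ("X2", "X2"),
}

if __name__ == "__main__":
    text = expand(EXAMPLE_RULES, "X3")
    print(text)
    print(lz77_factorize(text))
```

In this example both the number of rules and the number of factors grow logarithmically in the length of the generated string, which is the kind of relationship between the two encodings that the abstract quantifies.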
© 2002 Springer-Verlag Berlin Heidelberg
Cite this paper
Rytter, W. (2002). Application of Lempel-Ziv Factorization to the Approximation of Grammar-Based Compression. In: Apostolico, A., Takeda, M. (eds) Combinatorial Pattern Matching. CPM 2002. Lecture Notes in Computer Science, vol 2373. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45452-7_3
DOI: https://doi.org/10.1007/3-540-45452-7_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43862-5
Online ISBN: 978-3-540-45452-6
eBook Packages: Springer Book Archive