Abstract
Denote by $LZ(\omega)$ the coded form of a string $\omega$ produced by the Lempel-Ziv encoding algorithm. We consider several classical algorithmic problems for texts in the compressed setting. The first of them is equality testing: given $LZ(\omega)$ and integers $i$, $j$, $k$, test the equality $\omega[i \ldots i+k] = \omega[j \ldots j+k]$. We give a simple and efficient randomized algorithm for this problem using the finger-printing idea. Equality testing is reduced to the equivalence of certain context-free grammars generating single strings. Equality testing is the bottleneck in other algorithms for compressed texts. We relate the time complexity of several classical problems for texts to the complexity $Eq(n)$ of equality testing. Assume $n = |LZ(T)|$, $m = |LZ(P)|$ and $U = |T|$. Then we can compute the compressed representations of the sets of occurrences of $P$ in $T$, periods of $T$, palindromes of $T$, and squares of $T$ in times $O(n \log^2 U \cdot Eq(m) + n^2 \log U)$, $O(n \log^2 U \cdot Eq(n) + n^2 \log U)$, $O(n \log^2 U \cdot Eq(n) + n^2 \log U)$ and $O(n^2 \log^3 U \cdot Eq(n) + n^3 \log^2 U)$, respectively, where $Eq(n) = O(n \log \log n)$. Randomization improves considerably upon the known deterministic algorithms ([7] and [8]).
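The finger-printing idea can be illustrated with a minimal sketch. It is not the algorithm of the paper: it assumes the text is already given as a grammar in which every nonterminal derives exactly one string (a single character or the concatenation of two earlier nonterminals), and it uses a plain Karp-Rabin style polynomial fingerprint in the spirit of [6]. The names `MOD`, `BASE`, `build_tables`, `prefix_fp`, `substring_fp` and `equal_substrings` are hypothetical, and indices are 0-based.

```python
import random

# Illustrative parameters (assumptions, not values from the paper).
MOD = (1 << 61) - 1                      # a Mersenne prime used as modulus
BASE = random.randrange(256, MOD - 1)    # randomly chosen base

def build_tables(rules):
    """rules[v] is a single character or a pair (left, right) of indices of
    earlier rules. Returns the length and fingerprint of every derived string."""
    length, fp = [], []
    for rule in rules:
        if isinstance(rule, str):
            length.append(1)
            fp.append(ord(rule) % MOD)
        else:
            left, right = rule
            length.append(length[left] + length[right])
            # fingerprint of a concatenation XY: fp(X) * BASE^|Y| + fp(Y)
            fp.append((fp[left] * pow(BASE, length[right], MOD) + fp[right]) % MOD)
    return length, fp

def prefix_fp(rules, length, fp, v, k):
    """Fingerprint of the first k characters of the string derived from v,
    computed by walking down the grammar without expanding the string."""
    acc = 0
    while k > 0:
        if k == length[v]:
            return (acc * pow(BASE, k, MOD) + fp[v]) % MOD
        left, right = rules[v]
        if k <= length[left]:
            v = left
        else:
            acc = (acc * pow(BASE, length[left], MOD) + fp[left]) % MOD
            k -= length[left]
            v = right
    return acc

def substring_fp(rules, length, fp, root, i, j):
    """Fingerprint of the substring at positions i..j (inclusive)."""
    hi = prefix_fp(rules, length, fp, root, j + 1)
    lo = prefix_fp(rules, length, fp, root, i)
    return (hi - lo * pow(BASE, j + 1 - i, MOD)) % MOD

def equal_substrings(rules, length, fp, root, i, j, k):
    """Randomized test of w[i..i+k] == w[j..j+k]; equal substrings always
    pass, unequal ones pass only if their fingerprints collide."""
    return (substring_fp(rules, length, fp, root, i, i + k)
            == substring_fp(rules, length, fp, root, j, j + k))

# Example grammar deriving "abababab" from five rules.
rules = ["a", "b", (0, 1), (2, 2), (3, 3)]
length, fp = build_tables(rules)
print(equal_substrings(rules, length, fp, 4, 0, 2, 3))
print(equal_substrings(rules, length, fp, 4, 0, 1, 3))
```

Each query touches one grammar node per level of the derivation tree, so the text of length $U$ is never expanded; the one-sided error comes only from fingerprint collisions.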
On leave from Institute of Informatics, Warsaw University, ul. Banacha 2, 02-097, Warszawa, Poland. WWW: http://zaa.mimuw.edu.pl/@lechu/lechu.html.
This research was partially supported by the DFG Grant KA 673/4-1, and by the ESPRIT BR Grant 7097 and the ECUS 030.
Supported partially by the grant KBN 8T11C01208.
References
[1] A. Amir, G. Benson and M. Farach, Let sleeping files lie: pattern-matching in Z-compressed files, in SODA'94.
[2] A. Amir and G. Benson, Efficient two dimensional compressed matching, Proc. of the 2nd IEEE Data Compression Conference, 279–288 (1992).
[3] A. Amir, G. Benson and M. Farach, Optimal two-dimensional compressed matching, in ICALP'94.
[4] A. Apostolico, D. Breslauer and Z. Galil, Optimal parallel algorithms for periods, palindromes and squares, in ICALP'92, 296–307.
[5] M. Farach and M. Thorup, String matching in Lempel-Ziv compressed strings, in STOC'95, 703–712.
[6] R.M. Karp and M. Rabin, Efficient randomized pattern matching algorithms, IBM Journal of Research and Dev. 31, 249–260 (1987).
[7] M. Karpinski, W. Plandowski and W. Rytter, The fully compressed string matching for Lempel-Ziv encoding, Technical Report, Institute of Informatics, Bonn University (1995).
[8] M. Karpinski, W. Rytter and A. Shinohara, Pattern-matching for strings with short description, in Combinatorial Pattern Matching, 1995.
[9] D. Knuth, The Art of Computer Programming, Vol. II: Seminumerical Algorithms, second edition, Addison-Wesley (1981).
[10] A. Lempel and J. Ziv, On the complexity of finite sequences, IEEE Trans. on Inf. Theory 22, 75–81 (1976).
[11] W. Plandowski, Testing equivalence of morphisms on context-free languages, ESA'94, Lecture Notes in Computer Science 855, Springer-Verlag, 460–470 (1994).
[12] J. Ziv and A. Lempel, A universal algorithm for sequential data compression, IEEE Trans. on Inf. Theory 23, 337–343 (1977).
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gasieniec, L., Karpinski, M., Plandowski, W., Rytter, W. (1996). Randomized efficient algorithms for compressed strings: the finger-print approach. In: Hirschberg, D., Myers, G. (eds) Combinatorial Pattern Matching. CPM 1996. Lecture Notes in Computer Science, vol 1075. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-61258-0_3
DOI: https://doi.org/10.1007/3-540-61258-0_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61258-2
Online ISBN: 978-3-540-68390-2
eBook Packages: Springer Book Archive