Randomized efficient algorithms for compressed strings: the finger-print approach

Extended abstract

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1075)

Abstract

Denote by LZ(ω) the coded form of a string ω produced by the Lempel-Ziv encoding algorithm. We consider several classical algorithmic problems for texts in the compressed setting. The first of them is equality testing: given LZ(ω) and integers i, j, k, test the equality ω[i...i+k] = ω[j...j+k]. We give a simple and efficient randomized algorithm for this problem using the fingerprinting idea. Equality testing is reduced to the equivalence of certain context-free grammars generating single strings. Equality testing is the bottleneck in other algorithms for compressed texts. We relate the time complexity of several classical problems for texts to the complexity Eq(n) of equality testing. Assume n = |LZ(T)|, m = |LZ(P)| and U = |T|. Then we can compute the compressed representations of the sets of occurrences of P in T, periods of T, palindromes of T, and squares of T, respectively, in times O(n log² U · Eq(m) + n² log U), O(n log² U · Eq(n) + n² log U), O(n log² U · Eq(n) + n² log U) and O(n² log³ U · Eq(n) + n³ log² U), where Eq(n) = O(n log log n). The randomization improves considerably upon the known deterministic algorithms ([7] and [8]).
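The fingerprinting idea can be illustrated on a grammar that generates a single string. The following Python sketch is not the algorithm of the paper, whose details are not given in the abstract. It assumes the text is already available as a straight-line grammar (the conversion from LZ(ω) is not shown), uses polynomial fingerprints with a random base modulo a fixed Mersenne prime, which is a variant of the Karp-Rabin scheme [6], and uses 0-based positions. The names `SLP`, `prefix_fp`, `substring_fp`, `probably_equal` and `expand` are hypothetical.

```python
import random

P = (1 << 61) - 1  # fixed Mersenne prime used as the fingerprint modulus (assumption)


class SLP:
    """Straight-line grammar: rule r is either a single character or a pair
    (a, b) of earlier rule indices, meaning rule r expands to a's text + b's text.
    The last rule is the start symbol."""

    def __init__(self, rules, seed=None):
        self.rules = rules
        # Random base: the source of randomization in this sketch.
        self.base = random.Random(seed).randrange(256, P - 1)
        self.length = []  # length of the expansion of each rule
        self.fp = []      # fingerprint of the expansion of each rule
        for rule in rules:
            if isinstance(rule, str):
                self.length.append(1)
                self.fp.append(ord(rule) % P)
            else:
                a, b = rule
                self.length.append(self.length[a] + self.length[b])
                shift = pow(self.base, self.length[b], P)
                self.fp.append((self.fp[a] * shift + self.fp[b]) % P)

    def prefix_fp(self, t):
        """Fingerprint of the first t characters of the generated string."""
        r = len(self.rules) - 1
        acc = 0  # fingerprint of the part of the prefix consumed so far
        while t > 0:
            if t == self.length[r]:
                return (acc * pow(self.base, t, P) + self.fp[r]) % P
            a, b = self.rules[r]  # here 0 < t < length[r], so r is a pair rule
            if t <= self.length[a]:
                r = a
            else:
                acc = (acc * pow(self.base, self.length[a], P) + self.fp[a]) % P
                t -= self.length[a]
                r = b
        return acc

    def substring_fp(self, i, k):
        """Fingerprint of w[i..i+k] (0-based, inclusive: k + 1 characters)."""
        hi = self.prefix_fp(i + k + 1)
        lo = self.prefix_fp(i)
        return (hi - lo * pow(self.base, k + 1, P)) % P

    def probably_equal(self, i, j, k):
        """True if w[i..i+k] = w[j..j+k]; a false positive is possible only
        with small probability over the choice of the random base."""
        return self.substring_fp(i, k) == self.substring_fp(j, k)


def expand(rules, r):
    """Explicit expansion of rule r, only for checking small examples."""
    rule = rules[r]
    if isinstance(rule, str):
        return rule
    return expand(rules, rule[0]) + expand(rules, rule[1])
```

Each query walks one root-to-leaf path of the grammar twice, so its cost depends on the height of the grammar rather than on the length U of the uncompressed text. Equal substrings always receive equal fingerprints; unequal substrings may collide, which makes the test one-sided Monte Carlo.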

On leave from Institute of Informatics, Warsaw University, ul. Banacha 2, 02-097, Warszawa, Poland. WWW: http://zaa.mimuw.edu.pl/@lechu/lechu.html.

This research was partially supported by the DFG Grant KA 673/4-1, and by the ESPRIT BR Grant 7097 and the ECUS 030.

Supported partially by the grant KBN 8T11C01208.

References

  1. A. Amir, G. Benson and M. Farach, Let sleeping files lie: pattern-matching in Z-compressed files, in SODA'94.

  2. A. Amir and G. Benson, Efficient two dimensional compressed matching, Proc. of the 2nd IEEE Data Compression Conference, pp. 279–288 (1992).

  3. A. Amir, G. Benson and M. Farach, Optimal two-dimensional compressed matching, in ICALP'94.

  4. A. Apostolico, D. Breslauer and Z. Galil, Optimal parallel algorithms for periods, palindromes and squares, in ICALP'92, pp. 296–307.

  5. M. Farach and M. Thorup, String matching in Lempel-Ziv compressed strings, in STOC'95, pp. 703–712.

  6. R.M. Karp and M. Rabin, Efficient randomized pattern matching algorithms, IBM Journal of Research and Dev. 31, pp. 249–260 (1987).

  7. M. Karpinski, W. Plandowski and W. Rytter, The fully compressed string matching for Lempel-Ziv encoding, Technical Report, Institute of Informatics, Bonn University (1995).

  8. M. Karpinski, W. Rytter and A. Shinohara, Pattern-matching for strings with short description, in Combinatorial Pattern Matching, 1995.

  9. D. Knuth, The Art of Computer Programming, Vol. II: Seminumerical Algorithms. Second edition. Addison-Wesley (1981).

  10. A. Lempel and J. Ziv, On the complexity of finite sequences, IEEE Trans. on Inf. Theory 22, pp. 75–81 (1976).

  11. W. Plandowski, Testing equivalence of morphisms on context-free languages, ESA'94, Lecture Notes in Computer Science 855, Springer-Verlag, 460–470 (1994).

  12. J. Ziv and A. Lempel, A universal algorithm for sequential data compression, IEEE Trans. on Inf. Theory 23, pp. 337–343 (1977).

Editor information

Dan Hirschberg, Gene Myers

Copyright information

© 1996 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gasieniec, L., Karpinski, M., Plandowski, W., Rytter, W. (1996). Randomized efficient algorithms for compressed strings: the finger-print approach. In: Hirschberg, D., Myers, G. (eds) Combinatorial Pattern Matching. CPM 1996. Lecture Notes in Computer Science, vol 1075. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-61258-0_3

  • DOI: https://doi.org/10.1007/3-540-61258-0_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-61258-2

  • Online ISBN: 978-3-540-68390-2

  • eBook Packages: Springer Book Archive
