An Efficient Pattern Matching Algorithm on a Subclass of Context Free Grammars

  • Conference paper
Developments in Language Theory (DLT 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3340)

Abstract

There is a close relationship between formal language theory and data compression. Since the 1990s, various types of grammar-based text compression algorithms have been introduced. Given an input string, a grammar-based text compression algorithm constructs a context-free grammar that generates only that string. An interesting and challenging problem is pattern matching on context-free grammars \(\mathcal{P}\) of size m and \(\mathcal{T}\) of size n, which are the descriptions of a pattern string P of length M and a text string T of length N, respectively. The goal is to solve the problem in time proportional only to m and n, not to M or N. Kieffer et al. introduced a very practical grammar-based compression method called multilevel pattern matching code (MPM code). In this paper, we propose an efficient pattern matching algorithm which, given two MPM grammars \(\mathcal{P}\) and \(\mathcal{T}\), runs in \(O(mn^2)\) time with \(O(mn)\) space. Our algorithm outperforms the previous best one by Miyazaki et al., which requires \(O(m^2 n^2)\) time and \(O(mn)\) space.
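
To make the distinction between grammar size and string length concrete, the following is a minimal sketch of a context-free grammar that generates exactly one string. The grammar `RULES`, the variable names `X1` to `X6`, and the helper functions are hypothetical and chosen for illustration; this is a generic straight-line grammar, not an MPM grammar and not the algorithm proposed in the paper.

```python
from functools import lru_cache

# Hypothetical toy grammar: each variable derives either a single
# character or the concatenation of two earlier variables, so the
# grammar generates exactly one string.
RULES = {
    "X1": "a",
    "X2": "b",
    "X3": ("X1", "X2"),  # ab
    "X4": ("X3", "X3"),  # abab
    "X5": ("X4", "X4"),  # abababab
    "X6": ("X5", "X5"),  # 16 characters
}


def expand(var):
    """Fully decompress a variable; time is proportional to the string length."""
    rule = RULES[var]
    if isinstance(rule, str):
        return rule
    left, right = rule
    return expand(left) + expand(right)


@lru_cache(maxsize=None)
def length(var):
    """Length of the derived string, computed once per variable (no expansion)."""
    rule = RULES[var]
    if isinstance(rule, str):
        return 1
    left, right = rule
    return length(left) + length(right)


def char_at(var, i):
    """Character at position i of the derived string, found by descending the grammar."""
    rule = RULES[var]
    if isinstance(rule, str):
        return rule
    left, right = rule
    n = length(left)
    return char_at(left, i) if i < n else char_at(right, i - n)


if __name__ == "__main__":
    print(len(RULES), length("X6"))  # grammar size vs. derived string length
    print(char_at("X6", 5))          # random access without decompressing
```

In this sketch, each doubling rule adds one variable but doubles the derived string, so the length can grow exponentially in the number of rules. That gap is why an algorithm whose running time depends only on m and n, rather than on M and N, is of interest.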

References

  1. Bryant, R.E.: Symbolic boolean manipulation with ordered binary decision diagrams. ACM Computing Surveys 24, 293–318 (1992)

  2. Charikar, M., Lehman, E., Liu, D., Panigrahy, R., Prabhakaran, M., Rasala, A., Sahai, A., Shelat, A.: Approximating the smallest grammar: Kolmogorov complexity in natural models. In: Proc. STOC 2002, pp. 792–801 (2002)

  3. Crochemore, M., Rytter, W.: Text Algorithms. Oxford University Press, New York (1994)

  4. Crochemore, M., Rytter, W.: Jewels of Stringology. World Scientific, Singapore (2002)

  5. Gage, P.: A new algorithm for data compression. The C Users Journal 12(2) (1994)

  6. Gusfield, D.: Algorithms on Strings, Trees, and Sequences. Cambridge University Press, New York (1997)

  7. Inenaga, S., Shinohara, A., Takeda, M.: A fully compressed pattern matching algorithm for simple collage systems. In: Proc. PSC 2004, pp. 98–113. Czech Technical University (2004)

  8. Karpinski, M., Rytter, W., Shinohara, A.: An efficient pattern-matching algorithm for strings with short descriptions. Nordic J. Comput. 4(2), 172–186 (1997)

  9. Kieffer, J., Yang, E.: Grammar-based codes: a new class of universal lossless source codes. IEEE Transactions on Information Theory 46(3), 737–754 (2000)

  10. Kieffer, J., Yang, E.: Grammar-based codes for universal lossless data compression. Communications in Information and Systems 2(2), 29–52 (2002)

  11. Kieffer, J., Yang, E., Nelson, G., Cosman, P.: Universal lossless compression via multilevel pattern matching. IEEE Transactions on Information Theory 46(4), 1227–1245 (2000)

  12. Larsson, J., Moffat, A.: Offline dictionary-based compression. In: Proc. DCC 1999, pp. 296–305. IEEE Computer Society Press, Los Alamitos (1999)

  13. Miyazaki, M., Shinohara, A., Takeda, M.: An improved pattern matching algorithm for strings in terms of straight line programs. Journal of Discrete Algorithms 1(1), 187–204 (2000)

  14. Nevill-Manning, C., Witten, I.: Compression and explanation using hierarchical grammars. Computer Journal 40(2/3), 103–116 (1997)

  15. Nevill-Manning, C., Witten, I.: Identifying hierarchical structure in sequences: a linear-time algorithm. J. Artificial Intelligence Research 7, 67–82 (1997)

  16. Nevill-Manning, C., Witten, I.: Inferring lexical and grammatical structure from sequences. In: Proc. DCC 1997, pp. 265–274. IEEE Computer Society Press, Los Alamitos (1997)

  17. Nevill-Manning, C., Witten, I.: Phrase hierarchy inference and compression in bounded space. In: Proc. DCC 1998, pp. 179–188. IEEE Computer Society Press, Los Alamitos (1998)

  18. Rytter, W.: Algorithms on compressed strings and arrays. In: Bartosek, M., Tel, G., Pavelka, J. (eds.) SOFSEM 1999. LNCS, vol. 1725, pp. 48–65. Springer, Heidelberg (1999)

  19. Rytter, W.: Application of Lempel-Ziv factorization to the approximation of grammar-based compression. Theoretical Comput. Sci. 302(1–3), 211–222 (2003)

  20. Woelfel, P.: Symbolic topological sorting with OBDDs. In: Rovan, B., Vojtáš, P. (eds.) MFCS 2003. LNCS, vol. 2747, pp. 671–680. Springer, Heidelberg (2003)

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Inenaga, S., Shinohara, A., Takeda, M. (2004). An Efficient Pattern Matching Algorithm on a Subclass of Context Free Grammars. In: Calude, C.S., Calude, E., Dinneen, M.J. (eds) Developments in Language Theory. DLT 2004. Lecture Notes in Computer Science, vol 3340. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30550-7_19

  • DOI: https://doi.org/10.1007/978-3-540-30550-7_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-24014-3

  • Online ISBN: 978-3-540-30550-7

  • eBook Packages: Computer Science, Computer Science (R0)
