Skip to main content

Compressed Automata for Dictionary Matching

  • Conference paper
Implementation and Application of Automata (CIAA 2013)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 7982))

Included in the following conference series:

  • 712 Accesses

Abstract

A variant of the dictionary matching problem is addressed where the dictionary is given in an SLP-compressed form. An Aho-Corasick automata-based algorithm is presented which pre-processes the compressed dictionary \(\mathcal{D}\) in O(n 4logn) time using O(n 2logN) space and recognizes all occurrences of the patterns in \(\mathcal{D}\) in amortized O(h + m) running time per character, where n and N are, respectively, the compressed and uncompressed sizes of \(\mathcal{D}\), and h is the height of \(\mathcal{D}\), and m is the number of patterns in the dictionary.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Aho, A.V., Corasick, M.: Efficient string matching: An aid to bibliographic search. Comm. ACM 18(6), 333–340 (1975)

    Article  MathSciNet  MATH  Google Scholar 

  2. Belazzougui, D.: Succinct dictionary matching with no slowdown. In: Amir, A., Parida, L. (eds.) CPM 2010. LNCS, vol. 6129, pp. 88–100. Springer, Heidelberg (2010)

    Chapter  Google Scholar 

  3. Bille, P., Landau, G.M., Raman, R., Sadakane, K., Satti, S.R., Weimann, O.: Random access to grammar-compressed strings. In: Proc. SODA 2011, pp. 373–389 (2011)

    Google Scholar 

  4. Crochemore, M., Rytter, W.: Text Algorithms. Oxford University Press, New York (1994)

    MATH  Google Scholar 

  5. Gąsieniec, L., Karpinski, M., Plandowski, W., Rytter, W.: Efficient algorithms for Lempel-Ziv encoding. In: Karlsson, R., Lingas, A. (eds.) SWAT 1996. LNCS, vol. 1097, pp. 392–403. Springer, Heidelberg (1996)

    Chapter  Google Scholar 

  6. Karpinski, M., Rytter, W., Shinohara, A.: An efficient pattern-matching algorithm for strings with short descriptions. Nordic Journal of Computing 4, 172–186 (1997)

    MathSciNet  MATH  Google Scholar 

  7. Kida, T., Shibata, Y., Takeda, M., Shinohara, A., Arikawa, S.: Collage system: A unifying framework for compressed pattern matching. Theor. Comput. Sci. 298(1), 253–272 (2003)

    Article  MathSciNet  MATH  Google Scholar 

  8. Larsson, N.J., Moffat, A.: Offline dictionary-based compression. In: Proc. DCC 1999, pp. 296–305. IEEE Computer Society (1999)

    Google Scholar 

  9. Miyazaki, M., Shinohara, A., Takeda, M.: An improved pattern matching algorithm for strings in terms of straight-line programs. In: Hein, J., Apostolico, A. (eds.) CPM 1997. LNCS, vol. 1264, pp. 1–11. Springer, Heidelberg (1997)

    Chapter  Google Scholar 

  10. Nevill-Manning, C.G., Witten, I.H., Maulsby, D.L.: Compression by induction of hierarchical grammars. In: Proc. DCC 1994, pp. 244–253 (1994)

    Google Scholar 

  11. Rytter, W.: Application of Lempel-Ziv factorization to the approximation of grammar-based compression. Theor. Comput. Sci. 302(1–3), 211–222 (2003)

    Article  MathSciNet  MATH  Google Scholar 

  12. Storer, J., Szymanski, T.: Data compression via textual substitution. J. ACM 29(4), 928–951 (1982)

    Article  MathSciNet  MATH  Google Scholar 

  13. Weiner, P.: Linear pattern-matching algorithms. In: Proc. of 14th IEEE Ann. Symp. on Switching and Automata Theory, pp. 1–11. Institute of Electrical Electronics Engineers, New York (1973)

    Chapter  Google Scholar 

  14. Welch, T.A.: A technique for high performance data compression. IEEE Computer 17, 8–19 (1984)

    Article  Google Scholar 

  15. Ziv, J., Lempel, A.: A universal algorithm for sequential data compression. IEEE Transactions on Information Theory IT-23(3), 337–349 (1977)

    Article  MathSciNet  Google Scholar 

  16. Ziv, J., Lempel, A.: Compression of individual sequences via variable-length coding. IEEE Transactions on Information Theory 24(5), 530–536 (1978)

    Article  MathSciNet  MATH  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

I, T., Nishimoto, T., Inenaga, S., Bannai, H., Takeda, M. (2013). Compressed Automata for Dictionary Matching. In: Konstantinidis, S. (eds) Implementation and Application of Automata. CIAA 2013. Lecture Notes in Computer Science, vol 7982. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39274-0_28

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-39274-0_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39273-3

  • Online ISBN: 978-3-642-39274-0

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics