Grammatical tree matching

  • Conference paper
Combinatorial Pattern Matching (CPM 1992)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 644)

Abstract

In structured text databases, documents are represented as parse trees, and different notions of tree matching can serve as primitives for query languages. Two useful notions, tree inclusion and tree pattern matching, both seem to require superlinear time. In this paper we give a general sufficient condition for a tree matching problem to be solvable in linear time, and apply it to tree pattern matching and tree inclusion. The application is based on the notion of a nonperiodic parse tree. We argue that most text documents can be modeled in a natural way using grammars that yield nonperiodic parse trees. We show how the knowledge that the target tree is nonperiodic can be used to obtain linear-time algorithms for these tree matching problems. We also discuss the preprocessing of patterns for grammatical tree matching.
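To make the notion of tree pattern matching concrete, the following is a minimal sketch of the naive baseline, not the linear-time method of the paper. It assumes ordered labeled trees, assumes that a pattern node matches a target node with the same label and the same number of children, and uses a hypothetical wildcard label `*` for a pattern leaf that matches any subtree. The names `Node`, `matches_at`, and `occurrences` are illustrative only. Trying the pattern at every target node costs time proportional to the product of the two tree sizes in the worst case, which is the kind of superlinear behavior the abstract refers to.

```python
from dataclasses import dataclass, field
from typing import List

WILDCARD = "*"  # hypothetical label: a pattern leaf that matches any subtree


@dataclass
class Node:
    """A node of an ordered labeled tree, e.g. a parse tree of a document."""
    label: str
    children: List["Node"] = field(default_factory=list)


def matches_at(pattern: Node, target: Node) -> bool:
    """Return True if the pattern matches with its root placed on `target`.

    Assumption: labels must agree and child counts must be equal;
    a childless wildcard node in the pattern matches any subtree.
    """
    if pattern.label == WILDCARD and not pattern.children:
        return True
    if pattern.label != target.label:
        return False
    if len(pattern.children) != len(target.children):
        return False
    return all(matches_at(p, t)
               for p, t in zip(pattern.children, target.children))


def occurrences(pattern: Node, target: Node) -> List[Node]:
    """Collect, in preorder, every target node at which the pattern matches."""
    found = [target] if matches_at(pattern, target) else []
    for child in target.children:
        found.extend(occurrences(pattern, child))
    return found


# A small document parse tree and a pattern asking for sections that
# consist of a title followed by exactly one further element.
doc = Node("article", [
    Node("title"),
    Node("section", [Node("title"), Node("para")]),
    Node("section", [Node("title"), Node("para"), Node("para")]),
])
pattern = Node("section", [Node("title"), Node(WILDCARD)])

hits = occurrences(pattern, doc)
```

Here `hits` contains only the first `section` node, because the second one has three children. Tree inclusion, the other notion named above, is more permissive than this and is not covered by the sketch.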

Work supported by the Academy of Finland

Editor information

Alberto Apostolico, Maxime Crochemore, Zvi Galil, Udi Manber

Copyright information

© 1992 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kilpeläinen, P., Mannila, H. (1992). Grammatical tree matching. In: Apostolico, A., Crochemore, M., Galil, Z., Manber, U. (eds) Combinatorial Pattern Matching. CPM 1992. Lecture Notes in Computer Science, vol 644. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-56024-6_13

  • DOI: https://doi.org/10.1007/3-540-56024-6_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-56024-1

  • Online ISBN: 978-3-540-47357-2

  • eBook Packages: Springer Book Archive
