Abstract
Document specification languages like XML model documents using extended context-free grammars. These differ from standard context-free grammars in that they allow arbitrary regular expressions on the right-hand side of productions. To query such documents, we introduce a new form of attribute grammars (extended AGs) that work directly over extended context-free grammars rather than over standard context-free grammars. Viewed as a query language, extended AGs are particularly relevant as they can take into account the inherent order of the children of a node in a document. We show that two key properties of standard attribute grammars carry over to extended AGs: efficiency of evaluation and decidability of well-definedness. We further characterize the expressiveness of extended AGs in terms of monadic second-order logic and establish that their non-emptiness and equivalence problems are complete for EXPTIME. As an application, we show that Region Algebra expressions can be efficiently translated into extended AGs. This translation drastically improves the known upper bound on the complexity of the emptiness and equivalence tests for Region Algebra expressions.
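To make the notion of an extended context-free grammar concrete, the following is a minimal sketch in Python, not the formalism used in the paper. It assumes a toy encoding in which each production maps a node label to a regular expression over the ordered sequence of child labels. The names `Node`, `PRODUCTIONS`, and `conforms`, and the `book` grammar itself, are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """A document tree node with an ordered list of children."""
    label: str
    children: List["Node"] = field(default_factory=list)


# Hypothetical extended context-free grammar: the right-hand side of each
# production is a regular expression over child labels. Each child label is
# encoded as the label followed by one space, so "(author )+" means
# "one or more author children".
PRODUCTIONS: Dict[str, str] = {
    "book": r"title (author )+(chapter )*",
    "title": r"",    # leaves: no children allowed
    "author": r"",
    "chapter": r"",
}


def conforms(node: Node, productions: Dict[str, str]) -> bool:
    """Check that every node's child sequence matches its production."""
    pattern = productions.get(node.label)
    if pattern is None:
        return False  # label not declared in the grammar
    # Encode the ordered children as a single string of labels.
    word = "".join(child.label + " " for child in node.children)
    if re.fullmatch(pattern, word) is None:
        return False
    return all(conforms(child, productions) for child in node.children)


# The order of children matters: a title must come before the authors.
good = Node("book", [Node("title"), Node("author"), Node("author"), Node("chapter")])
bad = Node("book", [Node("author"), Node("title")])

print(conforms(good, PRODUCTIONS))
print(conforms(bad, PRODUCTIONS))
```

A standard context-free grammar would need auxiliary nonterminals to express "one or more authors followed by any number of chapters". Here the regular expression states it directly, and the check depends on the order of the children.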
© 2000 Springer-Verlag Berlin Heidelberg
Cite this paper
Neven, F. (2000). Extensions of Attribute Grammars for Structured Document Queries. In: Connor, R., Mendelzon, A. (eds) Research Issues in Structured and Semistructured Database Programming. DBPL 1999. Lecture Notes in Computer Science, vol 1949. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44543-9_7
DOI: https://doi.org/10.1007/3-540-44543-9_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41481-0
Online ISBN: 978-3-540-44543-2