DOI: 10.1145/1242572.1242714
Article

Visibly pushdown automata for streaming XML

Published: 08 May 2007

Abstract

We propose the study of visibly pushdown automata (VPA) for processing XML documents. VPAs are pushdown automata in which the input symbol determines the stack operation. XML documents are naturally visibly pushdown: the VPA pushes onto the stack on open-tags and pops the stack on close-tags. In this paper we demonstrate the power and ease that visibly pushdown automata bring to the design of streaming algorithms for XML documents.
We study the problem of type-checking streaming XML documents against SDTD schemas, and the problem of typing tags in a streaming XML document according to an SDTD schema. For the latter problem, we consider both pre-order typing and post-order typing of a document, which dynamically determine types at open-tags and close-tags respectively, as soon as they are met. We also generalize the problems of pre-order and post-order typing to prefix querying. We show that a deterministic VPA yields a one-pass algorithm that computes the set of all answers to any query with the property that whether a node satisfies the query is determined solely by the prefix leading to the node. All the streaming algorithms we develop in this paper are based on the construction of deterministic VPAs, and hence, for any fixed problem, the algorithms process each element of the input in constant time and use space O(d), where d is the depth of the document.
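To make the push-on-open, pop-on-close discipline concrete, here is a minimal Python sketch of a deterministic, VPA-style streaming validator. It is not the construction from the paper: it checks a plain DTD-like schema rather than an SDTD, and the toy schema (`book`, `chapter`, `title`), the `#doc` pseudo-element, the event encoding, and the `validate` function are all hypothetical names chosen for illustration. What it is meant to show is that the kind of input symbol alone decides the stack operation, that each event costs a constant number of table lookups for a fixed schema, and that the stack never grows beyond the document depth d.

```python
from typing import Dict, Iterable, List, Set, Tuple

Event = Tuple[str, str]  # ("open" | "close", tag name)
# A content model as a DFA over child tag names:
# (initial state, accepting states, transitions[(state, child)] -> state)
Dfa = Tuple[int, Set[int], Dict[Tuple[int, str], int]]

DOC = "#doc"  # hypothetical pseudo-element that contains the root
SCHEMA: Dict[str, Dfa] = {
    DOC: (0, {1}, {(0, "book"): 1}),                         # exactly one book
    "book": (0, {1}, {(0, "title"): 1, (1, "chapter"): 1}),  # title chapter*
    "chapter": (0, {1}, {(0, "title"): 1}),                  # title
    "title": (0, {0}, {}),                                   # no children
}


def validate(events: Iterable[Event],
             schema: Dict[str, Dfa] = SCHEMA) -> Tuple[bool, int]:
    """Return (is_valid, maximum stack depth reached)."""
    stack: List[Tuple[str, int]] = []   # suspended ancestors: (tag, DFA state)
    tag, state = DOC, schema[DOC][0]    # innermost open element and its state
    max_depth = 0
    for kind, name in events:
        if kind == "open":
            # Call transition: advance the parent's DFA on the child name,
            # push the parent's configuration, start the child's DFA.
            nxt = schema[tag][2].get((state, name))
            if nxt is None or name not in schema:
                return False, max_depth
            stack.append((tag, nxt))
            max_depth = max(max_depth, len(stack))
            tag, state = name, schema[name][0]
        elif kind == "close":
            # Return transition: the closing tag must match, the child's
            # content must be complete, then the parent is popped.
            if name != tag or state not in schema[tag][1] or not stack:
                return False, max_depth
            tag, state = stack.pop()
        else:
            return False, max_depth
    return tag == DOC and state in schema[DOC][1], max_depth


good = [("open", "book"), ("open", "title"), ("close", "title"),
        ("open", "chapter"), ("open", "title"), ("close", "title"),
        ("close", "chapter"), ("close", "book")]
print(validate(good))                                   # (True, 3)
print(validate([("open", "book"), ("close", "book")]))  # (False, 1)
```

The second call fails because `book` is closed before its required `title` has been seen. In both runs the reported stack depth equals the nesting depth reached, which matches the O(d) space bound stated in the abstract.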

Published In

WWW '07: Proceedings of the 16th international conference on World Wide Web
May 2007
1382 pages
ISBN: 9781595936547
DOI: 10.1145/1242572

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. XML
  2. pushdown automata
  3. query
  4. schema
  5. streaming algorithms
  6. typing

Qualifiers

  • Article

Conference

WWW'07
WWW'07: 16th International World Wide Web Conference
May 8 - 12, 2007
Banff, Alberta, Canada

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Cited By

  • (2024) Streaming enumeration on nested documents. ACM Transactions on Database Systems. DOI: 10.1145/3701557. Online publication date: 25-Oct-2024
  • (2024) V-Star: Learning Visibly Pushdown Grammars from Program Inputs. Proceedings of the ACM on Programming Languages 8:PLDI, pp. 2003-2026. DOI: 10.1145/3656458. Online publication date: 20-Jun-2024
  • (2023) Subhedge Projection for Stepwise Hedge Automata. Fundamentals of Computation Theory, pp. 16-31. DOI: 10.1007/978-3-031-43587-4_2. Online publication date: 18-Sep-2023
  • (2023) Validating Streaming JSON Documents with Learned VPAs. Tools and Algorithms for the Construction and Analysis of Systems, pp. 271-289. DOI: 10.1007/978-3-031-30823-9_14. Online publication date: 22-Apr-2023
  • (2023) Extending Visibly Pushdown Automata over Multi-matching Nested Relations. Structured Object-Oriented Formal Language and Method, pp. 59-69. DOI: 10.1007/978-3-031-29476-1_5. Online publication date: 25-Mar-2023
  • (2022) PDAAAL: A Library for Reachability Analysis of Weighted Pushdown Systems. Automated Technology for Verification and Analysis, pp. 225-230. DOI: 10.1007/978-3-031-19992-9_14. Online publication date: 21-Oct-2022
  • (2020) Multi-matching Nested Relations. Theoretical Computer Science. DOI: 10.1016/j.tcs.2020.12.004. Online publication date: Dec-2020
  • (2020) Transforming Multi-matching Nested Traceable Automata to Multi-matching Nested Expressions. Combinatorial Optimization and Applications, pp. 320-333. DOI: 10.1007/978-3-030-64843-5_22. Online publication date: 4-Dec-2020
  • (2018) Certain Query Answering on Compressed String Patterns: From Streams to Hyperstreams. Reachability Problems, pp. 117-132. DOI: 10.1007/978-3-030-00250-3_9. Online publication date: 30-Aug-2018
  • (2017) Minimization of Visibly Pushdown Automata Using Partial Max-SAT. Proceedings, Part I, of the 23rd International Conference on Tools and Algorithms for the Construction and Analysis of Systems - Volume 10205, pp. 461-478. DOI: 10.1007/978-3-662-54577-5_27. Online publication date: 22-Apr-2017