Querying XML streams

The VLDB Journal

Abstract.

Efficient querying of XML streams will be one of the fundamental features of next-generation information systems. In this paper we propose the TurboXPath path processor, which accepts a language equivalent to a subset of the for-let-where constructs of XQuery over a single document. TurboXPath can be extended to provide full XQuery support or used to augment federated database engines for efficient handling of queries over XML data streams produced by external sources. Internally, TurboXPath uses a tree-shaped path expression with multiple outputs to drive the execution. The result of a query execution is a sequence of tuples of XML fragments matching the output nodes. Based on a streamed execution model, TurboXPath scales up to large documents and has limited memory consumption for increased concurrency. Experimental evaluation of a prototype demonstrates performance gains compared to other state-of-the-art path processors.
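The execution model described in the abstract (a path with several output nodes, evaluated over a stream, returning one tuple of XML fragments per match) can be illustrated with a minimal sketch. This is not the TurboXPath algorithm, which the abstract does not detail. It is a simplified stand-in built on Python's standard `xml.etree.ElementTree.iterparse`, and it handles only a fixed root-to-element path with child-element outputs. The names `stream_tuples`, `for_path`, `outputs`, and `where`, as well as the sample catalog document, are assumptions made for illustration.

```python
import io
import xml.etree.ElementTree as ET


def stream_tuples(source, for_path, outputs, where=None):
    """Yield one tuple of serialized XML fragments per element on for_path.

    for_path: tag names from the root down to the bound element (hypothetical).
    outputs:  child tag names whose fragments form the tuple (hypothetical).
    where:    optional predicate applied to the bound element.
    """
    target = list(for_path)
    stack = []  # tags of the currently open elements, root first
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
            continue
        # "end" event: the element and its whole subtree have been parsed
        if stack == target:
            if where is None or where(elem):
                row = []
                for name in outputs:
                    child = elem.find(name)
                    if child is None:
                        row.append(None)
                    else:
                        row.append(ET.tostring(child, encoding="unicode").strip())
                yield tuple(row)
            # Discard the matched subtree's contents so memory stays small;
            # the emptied element itself remains attached to its parent.
            elem.clear()
        stack.pop()


# Made-up sample document, used only to exercise the sketch.
doc = b"""<catalog>
  <book><title>A</title><price>10</price></book>
  <book><title>B</title><price>50</price></book>
</catalog>"""

rows = list(
    stream_tuples(
        io.BytesIO(doc),
        for_path=["catalog", "book"],
        outputs=["title", "price"],
        where=lambda book: float(book.findtext("price")) > 20,
    )
)
print(rows)
```

The call corresponds roughly to a for-where query that binds each `book` under `catalog`, keeps those with a price above 20, and returns the `title` and `price` fragments together. Because each matched subtree is cleared once its tuple has been emitted, the parser holds roughly one `book` element's content at a time rather than the whole document.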

Author information

Corresponding author

Correspondence to Vanja Josifovski.

Additional information

Received: 30 January 2003, Accepted: 4 February 2004, Published online: 8 April 2004

Edited by: R. Baeza-Yates.

About this article

Cite this article

Josifovski, V., Fontoura, M. & Barta, A. Querying XML streams. The VLDB Journal 14, 197–210 (2005). https://doi.org/10.1007/s00778-004-0123-7

  • DOI: https://doi.org/10.1007/s00778-004-0123-7
