Abstract.
Efficient querying of XML streams will be one of the fundamental features of next-generation information systems. In this paper we propose the TurboXPath path processor, which accepts a language equivalent to a subset of the for-let-where constructs of XQuery over a single document. TurboXPath can be extended to provide full XQuery support or used to augment federated database engines for efficient handling of queries over XML data streams produced by external sources. Internally, TurboXPath uses a tree-shaped path expression with multiple outputs to drive the execution. The result of a query execution is a sequence of tuples of XML fragments matching the output nodes. Based on a streamed execution model, TurboXPath scales up to large documents and has limited memory consumption for increased concurrency. Experimental evaluation of a prototype demonstrates performance gains compared to other state-of-the-art path processors.
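To make the streamed execution model concrete, the following is a minimal sketch of the general idea and not the TurboXPath algorithm itself. It assumes a query of the shape "for each element on a fixed child-axis path, return a tuple of selected child fragments", with no predicates or descendant axes. The names `stream_tuples`, `context_path`, and `output_steps` are hypothetical and chosen only for illustration. The sketch uses Python's standard `xml.etree.ElementTree.iterparse` to read the document incrementally and discards each matched subtree once its tuple has been emitted.

```python
import io
import xml.etree.ElementTree as ET


def _fragment(elem, step):
    """Serialize the first child of elem named step, or None if absent."""
    child = elem.find(step)
    if child is None:
        return None
    # tostring() also emits trailing whitespace (the tail); strip it.
    return ET.tostring(child, encoding="unicode").strip()


def stream_tuples(source, context_path, output_steps):
    """Yield one tuple of XML fragments per element matching context_path.

    context_path: tag names from the root, e.g. ["catalog", "book"].
    output_steps: child tag names whose fragments form each tuple.
    Both parameters are illustrative assumptions, not TurboXPath's interface.
    """
    stack = []  # tag names of the currently open elements
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
            continue
        # "end" event: the element and its whole subtree have been read.
        if stack == context_path:
            yield tuple(_fragment(elem, step) for step in output_steps)
            # Drop the processed subtree; an empty shell stays in the parent.
            elem.clear()
        stack.pop()


if __name__ == "__main__":
    doc = (
        b"<catalog>"
        b"<book><title>A</title><author>X</author></book>"
        b"<book><title>B</title></book>"
        b"</catalog>"
    )
    for row in stream_tuples(io.BytesIO(doc), ["catalog", "book"], ["title", "author"]):
        print(row)
```

Each yielded tuple corresponds to one binding of the "for" variable, with a `None` entry where an output node has no match. Because matched subtrees are cleared as soon as their tuple is produced, memory use in this sketch is driven by the size of a single matched element rather than by the size of the whole document.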
Additional information
Received: 30 January 2003, Accepted: 4 February 2004, Published online: 8 April 2004
Edited by: R. Baeza-Yates.
Cite this article
Josifovski, V., Fontoura, M. & Barta, A. Querying XML streams. The VLDB Journal 14, 197–210 (2005). https://doi.org/10.1007/s00778-004-0123-7
DOI: https://doi.org/10.1007/s00778-004-0123-7