ABSTRACT
We address the efficient processing of continuous XML streams, in which a server broadcasts XML data to multiple clients concurrently through a multicast data stream, while each client is fully responsible for processing the stream. In our framework, a server may disseminate XML fragments from multiple documents in the same stream, can repeat or replace fragments, and can introduce new fragments or delete invalid ones. A client uses a lightweight database based on our proposed XML algebra to cache stream data and to evaluate XML queries against these data. Synchronization between clients and servers is achieved through annotations and punctuations transmitted along with the data streams. We present a framework for processing XQuery queries over continuous XML streams. It is based on a novel XML algebra and a new algebraic optimization framework that relies on query decorrelation, which is essential for non-blocking stream processing.
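The client-side behaviour described above (caching fragments from several documents, tolerating repeats, applying replacements and deletions, and reacting to punctuations) can be illustrated with a minimal sketch. It is not the paper's algebra or wire format: the `Message` and `ClientCache` names, the message kinds, and the fragment identifiers are all hypothetical, and a punctuation is assumed here to mean simply that the server has marked a document as complete for now.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Message:
    """One hypothetical stream item; the field names are assumptions."""
    kind: str                       # "fragment", "delete", or "punctuation"
    doc_id: str                     # which document the item belongs to
    frag_id: Optional[int] = None   # fragment identifier within the document
    xml: Optional[str] = None       # fragment payload, if any


@dataclass
class ClientCache:
    """Toy stand-in for the client's lightweight database."""
    fragments: Dict[Tuple[str, int], str] = field(default_factory=dict)
    closed_docs: Set[str] = field(default_factory=set)

    def apply(self, msg: Message) -> None:
        if msg.kind == "fragment":
            # A repeated fragment overwrites itself (idempotent);
            # a replacement overwrites the older version.
            self.fragments[(msg.doc_id, msg.frag_id)] = msg.xml
        elif msg.kind == "delete":
            # Deleting an unknown fragment is ignored.
            self.fragments.pop((msg.doc_id, msg.frag_id), None)
        elif msg.kind == "punctuation":
            # Assumed meaning: the document is complete for now.
            self.closed_docs.add(msg.doc_id)
        else:
            raise ValueError(f"unknown message kind: {msg.kind}")

    def document(self, doc_id: str) -> List[str]:
        """Return the cached fragments of one document in frag_id order."""
        return [xml for (d, _), xml in sorted(self.fragments.items())
                if d == doc_id]
```

Keying the cache on the pair of document and fragment identifiers is what lets fragments from several documents interleave in one stream, and it makes repeated broadcasts harmless to a client that already holds the fragment.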
Index Terms
Query processing of streamed XML data