Unleashing XQuery for Data-Independent Programming

Bächle, Sebastian; Sauer, Caetano

doi:10.1007/s13222-014-0160-3

Technical contribution (Fachbeitrag)
Published: 17 June 2014

Volume 14, pages 135–150, (2014)
Datenbank-Spektrum

Sebastian Bächle¹ &
Caetano Sauer¹

Abstract

The XQuery language was initially developed as an SQL equivalent for XML data, but its roots in functional programming also make it an excellent fit for processing almost any kind of structured and semi-structured data. Beyond standard XML processing, however, its advanced language features make it hard to implement the complete language efficiently for large data volumes. This work proposes a novel compilation strategy that provides both flexibility and efficiency to unleash XQuery’s potential as a data programming language. It combines the simplicity and versatility of a storage-independent data abstraction with the scalability advantages of set-oriented processing. Expensive iterative sections in a query are unrolled to a pipeline of relational-style operators, which is amenable to optimized join processing, index use, and parallelization. The remaining aspects of the language are processed in a standard fashion, yet can be compiled at any time to more efficient native operations of the actual runtime environment. This hybrid compilation mechanism yields an efficient and highly flexible query engine that is able to drive any computation from simple XML transformation to complex data analysis, even on non-XML data. Experiments with our prototype and state-of-the-art competitors in classic XML query processing and business analytics over relational data attest to the generality and efficiency of the design.
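
The idea of unrolling an iterative query section into a pipeline of relational-style operators can be illustrated with a small sketch. It is not the authors' implementation: the operator names (`for_bind`, `select`, `hash_join`), the sample data, and the dictionary-based "context tuple" are all assumptions made for illustration. The sketch evaluates the same nested loop twice, once as a chain of bind and filter operators and once with the filter replaced by a hash join, which is the kind of rewrite a tuple-stream representation makes possible.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Ctx = Dict[str, dict]  # hypothetical "context tuple": variable name -> bound item

# Made-up sample data, purely for illustration.
CUSTOMERS = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}]
ORDERS = [{"cust": 2, "total": 10}, {"cust": 1, "total": 5}, {"cust": 2, "total": 7}]


def for_bind(inp: Iterable[Ctx], var: str,
             items: Callable[[Ctx], Iterable[dict]]) -> Iterator[Ctx]:
    """Tuple-stream version of a `for` clause: one output tuple per bound item."""
    for t in inp:
        for item in items(t):
            yield {**t, var: item}


def select(inp: Iterable[Ctx], pred: Callable[[Ctx], bool]) -> Iterator[Ctx]:
    """Tuple-stream version of a `where` clause."""
    return (t for t in inp if pred(t))


def hash_join(left: Iterable[Ctx], right: Iterable[Ctx],
              left_key: Callable[[Ctx], object],
              right_key: Callable[[Ctx], object]) -> Iterator[Ctx]:
    """Equi-join: build a hash table on the right input, probe it with the left."""
    table: Dict[object, List[Ctx]] = {}
    for r in right:
        table.setdefault(right_key(r), []).append(r)
    for lt in left:
        for r in table.get(left_key(lt), []):
            yield {**lt, **r}


def project(inp: Iterable[Ctx]) -> List[tuple]:
    """Stand-in for the `return` clause: (customer name, order total)."""
    return [(t["c"]["name"], t["o"]["total"]) for t in inp]


def nested_plan() -> List[tuple]:
    """Direct unrolling: for $o ... for $c ... where $c.id = $o.cust return ..."""
    orders = for_bind([{}], "o", lambda t: ORDERS)
    pairs = for_bind(orders, "c", lambda t: CUSTOMERS)
    return project(select(pairs, lambda t: t["c"]["id"] == t["o"]["cust"]))


def join_plan() -> List[tuple]:
    """Same query after replacing the bind/filter pair with a hash join."""
    orders = for_bind([{}], "o", lambda t: ORDERS)
    customers = for_bind([{}], "c", lambda t: CUSTOMERS)
    joined = hash_join(orders, customers,
                       lambda t: t["o"]["cust"], lambda t: t["c"]["id"])
    return project(joined)
```

Both plans yield the same result in the same order here, because the hash join probes with the outer (orders) stream and so preserves its order; a real optimizer would have to establish that order is preserved, or not required, before applying such a rewrite.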

Notes

New in XQuery 3.0: http://www.w3.org/TR/xquery-30/
Note that we explicitly distinguish between iterators, which iterate over the items of an XQuery sequence, and cursors, which iterate over streams of context tuples.
The aggregation specification for non-grouping variables of a GroupBy operator is represented as a comma-separated list of XQuery-like expressions in the subscript. The specification $c:($c), for example, says that variable $c is aggregated using the sequence constructor (). The asterisk serves as a wildcard for specifying a default aggregation expression.
Source code available at http://brackit.org
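
The note on the GroupBy operator can be made concrete with a minimal sketch. It assumes context tuples are plain dictionaries and that an aggregation specification is a mapping from variable name to a function, with the key `"*"` playing the role of the wildcard default; `group_by` and `sequence` are hypothetical names, not taken from the authors' code base.

```python
from typing import Callable, Dict, Iterable, List

ContextTuple = Dict[str, object]
Aggregate = Callable[[List[object]], object]


def sequence(values: List[object]) -> List[object]:
    """Mimic the XQuery sequence constructor: concatenate values into one flat sequence."""
    out: List[object] = []
    for v in values:
        if isinstance(v, list):
            out.extend(v)  # XQuery sequences never nest
        else:
            out.append(v)
    return out


def group_by(cursor: Iterable[ContextTuple], keys: List[str],
             aggs: Dict[str, Aggregate]) -> List[ContextTuple]:
    """Group a tuple stream on `keys`; aggregate every other variable.

    `aggs` maps a variable name to its aggregation function and must
    contain the wildcard entry "*" used for all unlisted variables.
    """
    default = aggs["*"]
    groups: Dict[tuple, List[ContextTuple]] = {}  # keeps first-seen group order
    for t in cursor:
        groups.setdefault(tuple(t[k] for k in keys), []).append(t)

    result: List[ContextTuple] = []
    for key_values, members in groups.items():
        out: ContextTuple = dict(zip(keys, key_values))
        for var in members[0]:
            if var not in keys:
                agg = aggs.get(var, default)
                out[var] = agg([m[var] for m in members])
        result.append(out)
    return result


def example() -> List[ContextTuple]:
    """Group on $a; sum $b explicitly; let the wildcard collect $c as a sequence."""
    tuples = [
        {"a": "x", "b": 1, "c": 10},
        {"a": "y", "b": 2, "c": 20},
        {"a": "x", "b": 3, "c": 30},
    ]
    return group_by(tuples, ["a"], {"b": sum, "*": sequence})
```

In this sketch, `group_by` consumes a cursor in the sense of the preceding note (a stream of context tuples), while the value it produces for `c` is a sequence whose items would be visited with an iterator.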

Author information

Authors and Affiliations

TU Kaiserslautern, P. O. Box 3049, 67653, Kaiserslautern, Germany
Sebastian Bächle & Caetano Sauer

Corresponding author

Correspondence to Sebastian Bächle.

About this article

Cite this article

Bächle, S., Sauer, C. Unleashing XQuery for Data-Independent Programming. Datenbank Spektrum 14, 135–150 (2014). https://doi.org/10.1007/s13222-014-0160-3

Received: 07 April 2014
Accepted: 13 May 2014
Published: 17 June 2014
Issue Date: July 2014
DOI: https://doi.org/10.1007/s13222-014-0160-3
