Unleashing XQuery for Data-Independent Programming

  • Technical contribution (Fachbeitrag)
Datenbank-Spektrum

Abstract

The XQuery language was initially developed as an SQL equivalent for XML data, but its roots in functional programming also make it a perfect choice for processing almost any kind of structured and semi-structured data. Beyond standard XML processing, however, advanced language features make it hard to implement the complete language efficiently for large data volumes. This work proposes a novel compilation strategy that provides both flexibility and efficiency to unleash XQuery’s potential as a data programming language. It combines the simplicity and versatility of a storage-independent data abstraction with the scalability advantages of set-oriented processing. Expensive iterative sections in a query are unrolled to a pipeline of relational-style operators, which is open to optimized join processing, index use, and parallelization. The remaining aspects of the language are processed in a standard fashion, yet can be compiled at any time to more efficient native operations of the actual runtime environment. This hybrid compilation mechanism yields an efficient and highly flexible query engine that is able to drive any computation from simple XML transformation to complex data analysis, even on non-XML data. Experiments with our prototype and state-of-the-art competitors in classic XML query processing and business analytics over relational data attest to the generality and efficiency of the design.
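To make the idea of unrolling an iterative query section into a pipeline of relational-style operators more concrete, here is a minimal Python sketch. It is not the authors' implementation: the operator names (`start`, `for_bind`, `select`, `return_`), the dictionary representation of context tuples, and the sample data are all assumptions chosen for illustration. The sketch evaluates a query of the shape `for $o in orders for $c in customers where $o.cust = $c.id return ($c.name, $o.total)` as a chain of operators over a stream of context tuples.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

# A context tuple maps variable names to their bound values (hypothetical layout).
ContextTuple = Dict[str, Any]


def start() -> Iterator[ContextTuple]:
    """Seed the pipeline with a single empty context tuple."""
    yield {}


def for_bind(src: Iterable[ContextTuple], var: str,
             expr: Callable[[ContextTuple], Iterable[Any]]) -> Iterator[ContextTuple]:
    """For each input tuple, bind `var` to every item produced by `expr`."""
    for t in src:
        for item in expr(t):
            yield {**t, var: item}


def select(src: Iterable[ContextTuple],
           pred: Callable[[ContextTuple], bool]) -> Iterator[ContextTuple]:
    """Pass through only the tuples that satisfy the predicate."""
    for t in src:
        if pred(t):
            yield t


def return_(src: Iterable[ContextTuple],
            expr: Callable[[ContextTuple], Any]) -> Iterator[Any]:
    """Evaluate the return expression once per surviving tuple."""
    for t in src:
        yield expr(t)


# Illustrative sample data, not taken from the article.
orders = [
    {"id": 1, "cust": "a", "total": 10},
    {"id": 2, "cust": "b", "total": 25},
    {"id": 3, "cust": "a", "total": 5},
]
customers = [{"id": "a", "name": "Ann"}, {"id": "b", "name": "Bob"}]

# The nested loops of the query become a flat operator pipeline.
pipeline = return_(
    select(
        for_bind(for_bind(start(), "o", lambda t: orders), "c", lambda t: customers),
        lambda t: t["o"]["cust"] == t["c"]["id"],
    ),
    lambda t: (t["c"]["name"], t["o"]["total"]),
)
result = list(pipeline)
print(result)  # [('Ann', 10), ('Bob', 25), ('Ann', 5)]
```

Once the loops are expressed this way, the second `for_bind` followed by `select` has the shape of a nested-loops join, which is the kind of pattern an optimizer could replace with a hash join or an index lookup.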

Notes

  1. New in XQuery 3.0: http://www.w3.org/TR/xquery-30/

  2. Note that we explicitly distinguish between iterators, which iterate over the items of an XQuery sequence, and cursors, which iterate over streams of context tuples.

  3. The aggregation specification for non-grouping variables of a GroupBy operator is represented as a comma-separated list of XQuery-like expressions in the subscript. The specification $c:($c), for example, says that variable $c is aggregated using the sequence constructor (). The asterisk serves as a wildcard for specifying a default aggregation expression.

  4. Source code available at http://brackit.org
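The aggregation specification described in Note 3 can be illustrated with a small Python sketch. This is an assumption-laden illustration, not the Brackit implementation: the function names `seq` and `group_by`, the dictionary-based context tuples, and the use of `sum` as the wildcard default are all hypothetical. The cursor from Note 2 is modeled as a plain Python iterator over context tuples.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

ContextTuple = Dict[str, Any]
Aggregate = Callable[[List[Any]], Any]


def seq(values: List[Any]) -> List[Any]:
    """Sequence constructor: concatenate all values into one flat sequence."""
    out: List[Any] = []
    for v in values:
        if isinstance(v, list):
            out.extend(v)  # sequences do not nest, so flatten one level
        else:
            out.append(v)
    return out


def group_by(cursor: Iterable[ContextTuple], keys: List[str],
             aggs: Dict[str, Aggregate]) -> Iterator[ContextTuple]:
    """Group context tuples on `keys`; aggregate every other variable.

    `aggs` maps a variable name to its aggregate; the entry "*" is the
    wildcard default used for variables without an explicit entry.
    """
    groups: Dict[tuple, List[ContextTuple]] = {}
    for t in cursor:
        groups.setdefault(tuple(t[k] for k in keys), []).append(t)
    for key, members in groups.items():  # dicts keep first-seen group order
        out: ContextTuple = dict(zip(keys, key))
        for var in members[0]:
            if var in keys:
                continue
            agg = aggs[var] if var in aggs else aggs["*"]
            out[var] = agg([m[var] for m in members])
        yield out


# Illustrative input stream: group on $a, aggregate $c with the sequence
# constructor (the "$c:($c)" case) and everything else with the default.
tuples = [
    {"a": "x", "c": 1, "n": 10},
    {"a": "y", "c": 2, "n": 20},
    {"a": "x", "c": 3, "n": 30},
]
grouped = list(group_by(iter(tuples), ["a"], {"c": seq, "*": sum}))
print(grouped)
# [{'a': 'x', 'c': [1, 3], 'n': 40}, {'a': 'y', 'c': [2], 'n': 20}]
```

The sketch materializes all groups before emitting output, which is the simplest choice; a real engine could instead sort or hash-partition the input stream.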

Author information

Corresponding author

Correspondence to Sebastian Bächle.

About this article

Cite this article

Bächle, S., Sauer, C. Unleashing XQuery for Data-Independent Programming. Datenbank Spektrum 14, 135–150 (2014). https://doi.org/10.1007/s13222-014-0160-3

  • DOI: https://doi.org/10.1007/s13222-014-0160-3
