Storing and querying XML data using denormalized relational databases

  • Regular Paper
  • The VLDB Journal

Abstract.

XML database systems have emerged as a result of the acceptance of the XML data model. Recent work has followed the promising approach of building XML database management systems on top of underlying RDBMSs. Achieving good query processing performance reduces to two questions: (i) How should the XML data be decomposed into data that are stored in the RDBMS? (ii) How should the XML query be translated into an efficient plan that sends one or more SQL queries to the underlying RDBMS and combines the returned data into the XML result? We provide a formal framework for XML Schema-driven decompositions, which encompasses the decompositions proposed in prior work and extends them with decompositions that employ denormalized tables and binary-coded XML fragments. We provide corresponding query processing algorithms that translate the XML query conditions into conditions on the relational tables and assemble the decomposed data into the XML query result. Our key performance focus is the response time for delivering the first results of a query. The most effective of the described decompositions have been implemented in XCacheDB, an XML DBMS built on top of a commercial RDBMS, which serves as our experimental basis. We present experiments and analysis that point to a class of decompositions, called inlined decompositions, that improve query performance for both full results and first results, without a significant increase in the size of the database.
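To make the two questions in the abstract concrete, the following is a minimal sketch of decomposing a small XML document into relational tables and assembling a query result back into XML. It is not the XCacheDB implementation or the paper's formal framework: the document, the table names (`customer`, `ord`, `customer_ord`), and the helper functions are all hypothetical, and SQLite stands in for the commercial RDBMS. The sketch contrasts a normalized decomposition, which needs a join at query time, with a denormalized table that stores the parent and child data together.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document (not taken from the paper).
DOC = """
<customers>
  <customer><name>Ann</name>
    <order><item>disk</item></order>
    <order><item>tape</item></order>
  </customer>
  <customer><name>Bob</name>
    <order><item>cable</item></order>
  </customer>
</customers>
"""

def load(conn, root):
    """Store the document twice: normalized and denormalized."""
    conn.executescript("""
        CREATE TABLE customer(id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE ord(id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
        -- Denormalized: customer data is repeated on every order row.
        CREATE TABLE customer_ord(customer_id INTEGER, name TEXT,
                                  order_id INTEGER, item TEXT);
    """)
    order_id = 0
    for cid, cust in enumerate(root.findall("customer"), start=1):
        name = cust.findtext("name")
        conn.execute("INSERT INTO customer VALUES (?, ?)", (cid, name))
        for order in cust.findall("order"):
            order_id += 1
            item = order.findtext("item")
            conn.execute("INSERT INTO ord VALUES (?, ?, ?)", (order_id, cid, item))
            conn.execute("INSERT INTO customer_ord VALUES (?, ?, ?, ?)",
                         (cid, name, order_id, item))

def items_normalized(conn, name):
    """Translate the condition name = ? into a two-table join."""
    rows = conn.execute(
        "SELECT o.item FROM customer c JOIN ord o ON o.customer_id = c.id "
        "WHERE c.name = ? ORDER BY o.id", (name,))
    return [r[0] for r in rows]

def items_denormalized(conn, name):
    """Same condition against the denormalized table; no join is needed."""
    rows = conn.execute(
        "SELECT item FROM customer_ord WHERE name = ? ORDER BY order_id", (name,))
    return [r[0] for r in rows]

def to_xml(name, items):
    """Assemble the relational rows back into an XML fragment."""
    cust = ET.Element("customer")
    ET.SubElement(cust, "name").text = name
    for item in items:
        order = ET.SubElement(cust, "order")
        ET.SubElement(order, "item").text = item
    return ET.tostring(cust, encoding="unicode")

conn = sqlite3.connect(":memory:")
load(conn, ET.fromstring(DOC))
print(to_xml("Ann", items_denormalized(conn, "Ann")))
```

The denormalized table trades some redundancy (the customer name is repeated per order) for a query that avoids the join. In this simplified sketch a customer with no orders would have no row in `customer_ord`, a case a real decomposition would have to handle.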

Author information

Corresponding author

Correspondence to Andrey Balmin.

Additional information

Received: 21 December 2001, Accepted: 1 July 2003, Published online: 23 June 2004

Edited by: A. Halevy

Andrey Balmin: Andrey Balmin has been supported by NSF IRI-9734548.

Yannis Papakonstantinou: The authors built the XCacheDB system while on leave at Enosys Software, Inc., during 2000.

About this article

Cite this article

Balmin, A., Papakonstantinou, Y. Storing and querying XML data using denormalized relational databases. The VLDB Journal 14, 30–49 (2005). https://doi.org/10.1007/s00778-003-0113-1
