Abstract
Due to the explosive growth in the number of XML documents, it is imperative to manage XML data in an XML data warehouse. XML warehousing poses challenges that are not found in relational data warehouses. In this paper, we first present a framework for building an XML data warehouse schema. To keep the warehouse scalable as data volume grows, we propose a number of partitioning techniques for multi-version XML data warehouses: document-based partitioning, schema-based partitioning, and a cascaded (mixed) partitioning model. Finally, we formulate cost models to evaluate various types of queries on an XML data warehouse.
Additional information
Communicated by Ladjel Bellatreche.
Cite this article
Rusu, L.I., Rahayu, W. & Taniar, D. Partitioning methods for multi-version XML data warehouses. Distrib Parallel Databases 25, 47–69 (2009). https://doi.org/10.1007/s10619-009-7034-y
DOI: https://doi.org/10.1007/s10619-009-7034-y