Abstract
Over the years, a significant number of integration techniques have been proposed for data warehouses that support web-integrated data. However, existing work focuses largely on design concepts. In this paper, we focus on the performance of a web database application, namely an integrated web data warehouse that uses a well-defined, uniform structure to handle web information sources, including semi-structured data such as XML and documents such as HTML. Using a case study, we implement a prototype that applies web-based manipulation to both incoming sources and result outputs. Thus, the system can not only be operated through the web but can also integrate web data sources with structured data sources. Our main contribution is a performance evaluation of an integrated web data warehouse application, which comprises two tasks. Task one verifies the correctness of the integrated data: the result set retrieved from the web integrated data warehouse system using complex and OLAP queries is checked against the result set retrieved from the existing independent data source systems. Task two measures the performance of the OLAP and complex queries by investigating the source operation functions that these queries use to retrieve data. Information about the source operation functions used by each query is obtained with the TKPROF utility.
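As an illustration of the correctness check in task one, the comparison of two result sets can be sketched as follows. This is a minimal sketch and not the authors' implementation: it assumes both result sets have already been fetched as lists of rows with the same column order, and the function name `compare_result_sets` and the sample rows are hypothetical.

```python
from collections import Counter


def compare_result_sets(warehouse_rows, source_rows):
    """Compare two query result sets as multisets of rows, ignoring row order.

    Hypothetical helper: both arguments are iterables of rows, where each row
    is a sequence of hashable column values in the same column order.
    """
    warehouse = Counter(tuple(row) for row in warehouse_rows)
    source = Counter(tuple(row) for row in source_rows)
    return {
        # True only if every row occurs the same number of times on both sides.
        "match": warehouse == source,
        # Rows returned by the independent sources but absent from the warehouse.
        "missing_from_warehouse": list((source - warehouse).elements()),
        # Rows returned by the warehouse but absent from the independent sources.
        "unexpected_in_warehouse": list((warehouse - source).elements()),
    }


# Illustrative data only (not taken from the paper).
warehouse_result = [("2005", "books", 120), ("2005", "music", 80)]
source_result = [("2005", "music", 80), ("2005", "books", 120)]
report = compare_result_sets(warehouse_result, source_result)
print(report["match"])
```

Treating each result set as a multiset keeps duplicate rows significant while making the check independent of the order in which either system returns its rows.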
Additional information
An erratum to this article can be found at http://dx.doi.org/10.1007/s10586-007-0024-9
About this article
Cite this article
Dung, X.T., Rahayu, W. & Taniar, D. A high performance integrated web data warehousing. Cluster Comput 10, 95–109 (2007). https://doi.org/10.1007/s10586-007-0008-9
DOI: https://doi.org/10.1007/s10586-007-0008-9