A high performance integrated web data warehousing

Cluster Computing

An Erratum to this article was published on 17 April 2007

Abstract

Over the years, a significant number of integration techniques have been proposed for data warehouses that support web integrated data. However, existing work focuses largely on design concepts. In this paper, we focus on the performance of a web database application, namely an integrated web data warehouse that uses a well-defined, uniform structure to handle web information sources, including semi-structured data such as XML and documents such as HTML. Using a case study, we implement a prototype that applies web-based manipulation to both incoming sources and result outputs. Thus, the system can not only be operated through the web but can also integrate web data sources with structured data sources. Our main contribution is the performance evaluation of an integrated web data warehouse application, which comprises two tasks. Task one verifies the correctness of the integrated data: the result set retrieved from the web integrated data warehouse system using complex and OLAP queries is checked against the result set retrieved from the existing independent data source systems. Task two measures the performance of OLAP and complex queries by investigating the source operation functions these queries use to retrieve data. The information about the source operation functions used by each query is obtained using the TKPROF utility.
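
The correctness check in task one can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the row layout, and the sample values are hypothetical, and it assumes both result sets have already been fetched into memory as sequences of rows. Rows are compared as multisets, so row order is ignored but duplicate rows still count.

```python
from collections import Counter


def result_sets_match(warehouse_rows, source_rows):
    """Return True when both result sets hold the same rows with the
    same multiplicities, regardless of row order."""
    return Counter(map(tuple, warehouse_rows)) == Counter(map(tuple, source_rows))


def diff_result_sets(warehouse_rows, source_rows):
    """Report rows present on one side only (hypothetical helper)."""
    warehouse = Counter(map(tuple, warehouse_rows))
    source = Counter(map(tuple, source_rows))
    return {
        # Rows the independent sources return but the warehouse does not.
        "missing_from_warehouse": list((source - warehouse).elements()),
        # Rows the warehouse returns but the independent sources do not.
        "unexpected_in_warehouse": list((warehouse - source).elements()),
    }


# Illustrative data only; not taken from the paper's case study.
warehouse = [("2005", "books", 120), ("2005", "music", 80)]
sources = [("2005", "music", 80), ("2005", "books", 120)]
print(result_sets_match(warehouse, sources))
```

A multiset comparison is used here because a plain set comparison would hide duplicated or dropped rows, which are typical integration errors.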

Author information

Correspondence to Xuan Thi Dung.

Additional information

An erratum to this article can be found at http://dx.doi.org/10.1007/s10586-007-0024-9

About this article

Cite this article

Dung, X.T., Rahayu, W. & Taniar, D. A high performance integrated web data warehousing. Cluster Comput 10, 95–109 (2007). https://doi.org/10.1007/s10586-007-0008-9

  • DOI: https://doi.org/10.1007/s10586-007-0008-9