DOI: 10.1145/1943628.1943666
research-article

Real-time data warehousing for business intelligence

Published: 21 December 2010

ABSTRACT

Real-time ETL and data warehouse multidimensional modeling (DMM) of operational business data have become important research issues in real-time data warehousing (RTDW). In this study, we discuss several recently proposed real-time ETL technologies from the perspectives of data volume, frequency, latency, and mode. In addition, we highlight several advantages of using a semi-structured DMM (i.e., XML) in RTDW instead of a traditional structured DMM (i.e., relational). We compare the two DMMs on four characteristics: heterogeneous data integration, types of measures supported, aggregate query processing, and incremental maintenance. We implemented the RTDW framework for an example telecommunication organization. Our experimental analysis shows that if the delay comes from the incremental maintenance of the DMM, no ETL technology (full reloading or incremental loading) can help in achieving real-time business intelligence.
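The difference between the two loading modes named in the abstract can be illustrated with a minimal sketch. It is not the framework implemented in the paper: the record layout (region, call minutes) and the function names `full_reload` and `incremental_load` are hypothetical, chosen only to echo the telecommunication example.

```python
from collections import defaultdict


def full_reload(source_rows):
    """Rebuild the aggregate from scratch by scanning every source row."""
    totals = defaultdict(float)
    for region, minutes in source_rows:
        totals[region] += minutes
    return dict(totals)


def incremental_load(totals, delta_rows):
    """Fold only the newly arrived rows into an existing aggregate."""
    updated = dict(totals)  # copy so the previous aggregate is left untouched
    for region, minutes in delta_rows:
        updated[region] = updated.get(region, 0.0) + minutes
    return updated


# Hypothetical call records: (region, call minutes).
history = [("north", 12.0), ("south", 7.5), ("north", 3.0)]
delta = [("south", 2.5), ("east", 4.0)]

reloaded = full_reload(history + delta)
incremental = incremental_load(full_reload(history), delta)
assert reloaded == incremental
```

Both paths produce the same aggregate; they differ only in how much source data is scanned per refresh. This sketch covers loading only and does not model the cost of maintaining the multidimensional model itself, which is the delay the abstract identifies as the limiting factor.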


                • Published in

                  FIT '10: Proceedings of the 8th International Conference on Frontiers of Information Technology
                  December 2010
                  281 pages
                  ISBN:9781450303422
                  DOI:10.1145/1943628

                  Copyright © 2010 ACM

                  Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                  Publisher

                  Association for Computing Machinery

                  New York, NY, United States
