ABSTRACT
Real-time ETL and data warehouse multidimensional modeling (DMM) of business operational data have become important research issues in the area of real-time data warehousing (RTDW). In this study, we discuss several recently proposed real-time ETL technologies from the perspectives of data volume, frequency, latency, and mode. In addition, we highlight several advantages of using semi-structured DMM (i.e., XML) in RTDW instead of traditional structured DMM (i.e., relational). We compare the two DMMs on the basis of four characteristics: heterogeneous data integration, types of measures supported, aggregate query processing, and incremental maintenance. We implemented the RTDW framework for an example telecommunication organization. Our experimental analysis shows that when the delay originates in the incremental maintenance of the DMM, no ETL technology (full reloading or incremental loading) can deliver real-time business intelligence.
Index Terms
- Real-time data warehousing for business intelligence