Abstract
Extraction-transformation-loading (ETL) processes play an important role in a data warehouse (DW) architecture because they are responsible for integrating data from heterogeneous data sources into the DW repository. Importantly, most of the budget of a DW project is spent on designing these processes, since they are not considered in the early phases of the project but only once the repository is deployed. To overcome this situation, we propose using the Unified Modelling Language (UML) to conceptually model the sequence of activities involved in ETL processes from the beginning of the project by means of activity diagrams (ADs). Our approach provides designers with easy-to-use modelling elements to capture the dynamic aspects of ETL processes.
Supported by Spanish projects ESPIA (TIN2007-67078) and QUASIMODO (PAC08-0157-0668). Lilia Muñoz is funded by SENACYT and IFARHU of the Republic of Panama. Jose-Norberto Mazón and Jesús Pardillo are funded under Spanish FPU grants AP2005-1360 and AP2006-00332, respectively.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Muñoz, L., Mazón, JN., Pardillo, J., Trujillo, J. (2008). Modelling ETL Processes of Data Warehouses with UML Activity Diagrams. In: Meersman, R., Tari, Z., Herrero, P. (eds) On the Move to Meaningful Internet Systems: OTM 2008 Workshops. OTM 2008. Lecture Notes in Computer Science, vol 5333. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88875-8_21
DOI: https://doi.org/10.1007/978-3-540-88875-8_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88874-1
Online ISBN: 978-3-540-88875-8
eBook Packages: Computer Science, Computer Science (R0)