Two-ETL Phases for Data Warehouse Creation: Design and Implementation

  • Conference paper
Advances in Databases and Information Systems (ADBIS 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9282)

Abstract

Building the ETL process is potentially one of the biggest tasks in building a data warehouse. It is complex and time-consuming, and it consumes most of a data warehouse project's implementation effort, cost, and resources. Moreover, differences in data structures impose new requirements on ETL process implementation and maintenance. These tasks become even more challenging because data continue to grow rapidly and business requirements change over time. In this paper, we propose a method comprising two ETL phases: one handles pre-treatment and the other the actual ETL. Our method consists of determining the correspondence table, modeling new operations using the Business Process Modeling Notation (BPMN), and implementing these operations with Talend Open Source (TOS). In addition, our method allows the ETL process to be designed at an earlier stage, which greatly facilitates its implementation. Another advantage of our proposal is the use of BPMN, which helps bridge the communication gap that often occurs between the design and implementation of business processes.
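The abstract does not spell out the correspondence table or the operations, so the following is only a minimal Python sketch of the general idea of a table-driven, two-phase flow. Every name in it (`CORRESPONDENCE`, `pretreat`, `etl`, the field names) is hypothetical, and the choice of "drop incomplete rows" as the pre-treatment step is an assumption made for illustration, not the paper's definition.

```python
# Hypothetical correspondence table: each source field maps to a
# warehouse column and a conversion operation. Illustrative names only.
CORRESPONDENCE = {
    "cust_name": ("customer_name", str.strip),
    "amount_eur": ("amount", float),
}


def pretreat(rows):
    """Phase 1 (assumed pre-treatment): keep only rows in which every
    mapped source field is present and non-empty."""
    return [
        row for row in rows
        if all(row.get(src) not in (None, "") for src in CORRESPONDENCE)
    ]


def etl(rows):
    """Phase 2 (actual ETL): rename and convert fields via the table."""
    return [
        {target: op(row[src]) for src, (target, op) in CORRESPONDENCE.items()}
        for row in rows
    ]


source = [
    {"cust_name": "  Alice ", "amount_eur": "12.50"},
    {"cust_name": "Bob", "amount_eur": ""},  # incomplete: removed in phase 1
]
warehouse = etl(pretreat(source))
```

Keeping the mapping in one table means that a change in source structure is handled by editing the table rather than the phase functions, which is one plausible reading of why a correspondence table is determined first.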


Author information

Corresponding author

Correspondence to Rania Yangui.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Nabli, A., Bouaziz, S., Yangui, R., Gargouri, F. (2015). Two-ETL Phases for Data Warehouse Creation: Design and Implementation. In: Tadeusz, M., Valduriez, P., Bellatreche, L. (eds) Advances in Databases and Information Systems. ADBIS 2015. Lecture Notes in Computer Science, vol 9282. Springer, Cham. https://doi.org/10.1007/978-3-319-23135-8_10

  • DOI: https://doi.org/10.1007/978-3-319-23135-8_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23134-1

  • Online ISBN: 978-3-319-23135-8

  • eBook Packages: Computer Science, Computer Science (R0)
