Extraction, Transformation, and Loading

Encyclopedia of Database Systems

Recommended Reading

  1. Akkaoui ZE, Zimányi E, Mazón J, Trujillo J. A BPMN-based design and maintenance framework for ETL processes. Int J Data Warehouse Min. 2013;9(3):46–72.

  2. Dayal U, Castellanos M, Simitsis A, Wilkinson K. Data integration flows for business intelligence. In: Advances in Database Technology, Proceedings of the 12th International Conference on Extending Database Technology; 2009. p. 1–11.

  3. Fagin R, Kolaitis PG, Popa L. Data exchange: getting to the core. ACM Trans Database Syst. 2005;30(1):174–210.

  4. Grund M, Krüger J, Plattner H, Zeier A, Cudré-Mauroux P, Madden S. HYRISE – a main memory hybrid storage engine. Proc. VLDB Endowment. 2010;4(2):105–16.

  5. Haas LM, Hernández MA, Ho H, Popa L, Roth M. Clio grows up: from research prototype to industrial tool. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2005. p. 805–10.

  6. Halasipuram R, Deshpande PM, Padmanabhan S. Determining essential statistics for cost based optimization of an ETL workflow. In: Proceedings of the 17th International Conference on Extending Database Technology; 2014. p. 307–18.

  7. Inmon W. Building the data warehouse. 2nd ed. New York: John Wiley & Sons; 1996.

  8. Kemper A, Neumann T. HyPer: a hybrid OLTP&OLAP main memory database system based on virtual memory snapshots. In: Proceedings of the 27th International Conference on Data Engineering; 2011. p. 195–206.

  9. Kimball R, Reeves L, Ross M, Thornthwaite W. The data warehouse lifecycle toolkit: expert methods for designing, developing, and deploying data warehouses. New York: Wiley; 1998.

  10. Labio W, Garcia-Molina H. Efficient snapshot differential algorithms for data warehousing. In: Proceedings of the 22nd International Conference on Very Large Data Bases; 1996. p. 63–74.

  11. Labio W, Wiener JL, Garcia-Molina H, Gorelik V. Efficient resumption of interrupted warehouse loads. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2000. p. 46–57.

  12. Lenzerini M. Data integration: a theoretical perspective. In: Proceedings of the 21st ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems; 2002. p. 233–46.

  13. Liu X, Thomsen C, Pedersen TB. ETLMR: a highly scalable dimensional ETL framework based on MapReduce. Trans Large-Scale Data-Knowl-Cent Syst. 2013;8:1–31.

  14. Luján-Mora S, Vassiliadis P, Trujillo J. Data mapping diagrams for data warehouse design with UML. In: Proceedings of the 23rd International Conference on Conceptual Modeling; 2004. p. 191–204.

  15. Oracle. Oracle9i SQL Reference. Release 9.2; 2002.

  16. Rahm E, Bernstein PA. A survey of approaches to automatic schema matching. VLDB J. 2001;10(4):334–50.

  17. Rizzi S, Abelló A, Lechtenbörger J, Trujillo J. Research in data warehouse modeling and design: dead or alive? In: Proceedings of the ACM 9th International Workshop on Data Warehousing and OLAP; 2006. p. 3–10.

  18. Romero O, Simitsis A, Abelló A. GEM: requirement-driven generation of ETL and multidimensional conceptual designs. In: Proceedings of the 13th International Conference on Data Warehousing and Knowledge Discovery; 2011. p. 80–95.

  19. Roth MT, Schwarz PM. Don’t scrap it, wrap it! A wrapper architecture for legacy data sources. In: Proceedings of the 23rd International Conference on Very Large Data Bases; 1997. p. 266–75.

  20. Shu NC, Housel BC, Taylor RW, Ghosh SP, Lum VY. Express: a data extraction, processing, and restructuring system. ACM Trans Database Syst. 1977;2(2):134–74.

  21. Simitsis A, Vassiliadis P, Sellis TK. Optimizing ETL processes in data warehouses. In: Proceedings of the 21st International Conference on Data Engineering; 2005. p. 564–75.

  22. Simitsis A, Vassiliadis P, Sellis TK. State-space optimization of ETL workflows. IEEE Trans Knowl Data Eng. 2005;17(10):1404–19.

  23. Simitsis A, Wilkinson K, Castellanos M, Dayal U. QoX-driven ETL design: reducing the cost of ETL consulting engagements. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; p. 953–60.

  24. Simitsis A, Wilkinson K, Castellanos M, Dayal U. Optimizing analytic data flows for multiple execution engines. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2012. p. 829–40.

  25. Skoutas D, Simitsis A. Designing ETL processes using semantic web technologies. In: Proceedings of the ACM 9th International Workshop on Data Warehousing and OLAP; 2006. p. 67–74.

  26. Thomsen C, Pedersen TB. Easy and effective parallel programmable ETL. In: Proceedings of the ACM 14th International Workshop on Data Warehousing and OLAP; 2011. p. 37–44.

  27. TPC. TPC-DS (Decision Support) specification, draft version 52; Feb 2007.

  28. Trujillo J, Luján-Mora S. A UML based approach for modeling ETL processes in data warehouses. In: Proceedings of the 22nd International Conference on Conceptual Modeling; 2003. p. 307–20.

  29. Vassiliadis P, Karagiannis A, Tziovara V, Simitsis A. Towards a benchmark for ETL workflows. In: Proceedings of the 5th International Workshop on Quality in Databases at VLDB; 2007.

  30. Vassiliadis P, Simitsis A, Skiadopoulos S. Conceptual modeling for ETL processes. In: Proceedings of the ACM 5th International Workshop on Data Warehousing and OLAP; 2002. p. 14–21.

Author information

Corresponding author

Correspondence to Alkis Simitsis.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this entry

Cite this entry

Simitsis, A., Vassiliadis, P. (2018). Extraction, Transformation, and Loading. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_158