An Approach to Evolution Management in Integrated Heterogeneous Data Sources

  • Conference paper
Enterprise Information Systems (ICEIS 2021)

Abstract

In this paper, we address the problem of evolution of the heterogeneous data sources of a data warehouse. Evolution may be caused by changes in the structure of data sources, which are often independent of the data warehouse, as well as by changes in information requirements. The solution we introduce is based on the architecture of a data analysis system that, apart from a data highway that collects and transforms data, also employs a metadata repository and various tools that provide different kinds of analysis of the stored data. The unique feature of our solution is an adaptation component that incorporates mechanisms for the automatic discovery of changes in the structure of integrated data sets and for the propagation of these changes to the data warehouse and other components of the data analysis system. In addition to presenting our approach, we give details of the evaluation of our software prototype in a case study system.
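To make the idea of automatic discovery of structural changes more concrete, the following is a minimal sketch and not the mechanism implemented in the prototype. It assumes that the structure of a data set can be summarized as a flat mapping from attribute names to type names, and that two snapshots of this mapping are available for comparison. The names `Change` and `discover_changes` are hypothetical and introduced only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One structural difference between two schema snapshots (hypothetical type)."""
    kind: str                    # "added", "removed", or "type_changed"
    attribute: str
    old_type: str | None = None
    new_type: str | None = None


def discover_changes(old_schema: dict[str, str],
                     new_schema: dict[str, str]) -> list[Change]:
    """Compare two attribute-name -> type-name mappings and list the differences.

    Assumption: schemas are flat; nested or multi-model structures would
    need a richer representation than a plain dictionary.
    """
    changes: list[Change] = []
    # Attributes present only in the new snapshot.
    for name in sorted(new_schema.keys() - old_schema.keys()):
        changes.append(Change("added", name, None, new_schema[name]))
    # Attributes present only in the old snapshot.
    for name in sorted(old_schema.keys() - new_schema.keys()):
        changes.append(Change("removed", name, old_schema[name], None))
    # Attributes present in both snapshots but with a different type.
    for name in sorted(old_schema.keys() & new_schema.keys()):
        if old_schema[name] != new_schema[name]:
            changes.append(
                Change("type_changed", name, old_schema[name], new_schema[name])
            )
    return changes


if __name__ == "__main__":
    old = {"id": "int", "name": "string", "price": "float"}
    new = {"id": "int", "name": "string", "price": "string", "currency": "string"}
    for change in discover_changes(old, new):
        print(change)
```

In a setting like the one described in the abstract, a list of detected changes of this kind could serve as input to a step that propagates them to the data warehouse. How that propagation is carried out is not covered by this sketch.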


Author information

Corresponding author

Correspondence to Darja Solodovnikova.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Solodovnikova, D., Niedrite, L., Svilpe, L. (2022). An Approach to Evolution Management in Integrated Heterogeneous Data Sources. In: Filipe, J., Śmiałek, M., Brodsky, A., Hammoudi, S. (eds) Enterprise Information Systems. ICEIS 2021. Lecture Notes in Business Information Processing, vol 455. Springer, Cham. https://doi.org/10.1007/978-3-031-08965-7_3

  • DOI: https://doi.org/10.1007/978-3-031-08965-7_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-08964-0

  • Online ISBN: 978-3-031-08965-7

  • eBook Packages: Computer Science, Computer Science (R0)
