Impact Analysis and Policy-Conforming Rewriting of Evolving Data-Intensive Ecosystems

  • Original Article
Journal on Data Semantics

Abstract

Data-intensive ecosystems are conglomerations of data repositories surrounded by applications that depend on them for their operation. In this paper, we address the problem of performing what-if analysis for the evolution of the database part of a data-intensive ecosystem: identifying which other parts of the ecosystem are affected by a potential change in the database schema and what the ecosystem will look like once the change has been performed, while retaining the ability to regulate the flow of events. We model the ecosystem as a graph that uniformly covers relations, views, and queries as nodes, and their internal structure and interdependencies as edges. We provide a simple language for annotating the modules of the graph with policies that govern their response to evolutionary events. The policies regulate the flow of events and their impact by (i) vetoing (“blocking”) the change in parts that the developers want to retain unaffected and (ii) allowing (“propagating”) the change in parts that need to adapt to the new schema. Our method for the automatic adaptation of ecosystems is based on three algorithms that automatically (i) assess the impact of a change, (ii) compute the need for different variants of an ecosystem’s components, depending on policy conflicts, and (iii) rewrite the modules to adapt to the change. We theoretically prove the coverage of the language, as well as the termination, consistency, and confluence of our algorithms, and experimentally verify our method’s effectiveness and efficiency.
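
To make the graph model and the two policy kinds concrete, the following is a minimal sketch and not the algorithms of the paper. It assumes a toy ecosystem with one relation `R`, two views, and two queries; the names `DEPENDENTS`, `POLICY`, and `assess_impact` are hypothetical, as is the rule that unannotated modules default to "propagate". It covers only a simplified form of impact assessment: an event travels from a changed module to its consumers, and a module annotated with "block" vetoes it.

```python
from collections import deque

# Hypothetical toy ecosystem: an entry u -> [v, ...] means "each v depends on u".
DEPENDENTS = {
    "R": ["V1", "V2"],   # relation R feeds views V1 and V2
    "V1": ["Q1"],        # query Q1 reads view V1
    "V2": ["Q2"],        # query Q2 reads view V2
    "Q1": [],
    "Q2": [],
}

# Hypothetical policy annotations; modules not listed default to "propagate".
POLICY = {"V2": "block"}


def assess_impact(start, dependents, policy):
    """Return (affected, blockers) for a change that originates at `start`.

    Breadth-first traversal over consumers. A consumer whose policy is
    "block" is recorded as a blocker and the event is not passed beyond it.
    """
    affected, blockers = [start], []
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for consumer in dependents.get(node, []):
            if consumer in seen:
                continue
            seen.add(consumer)
            if policy.get(consumer, "propagate") == "block":
                blockers.append(consumer)   # vetoes the event
            else:
                affected.append(consumer)   # accepts the event and forwards it
                queue.append(consumer)
    return affected, blockers


affected, blockers = assess_impact("R", DEPENDENTS, POLICY)
print(affected)  # ['R', 'V1', 'Q1']
print(blockers)  # ['V2']
```

The sketch stops at a blocker and reports it. It does not attempt the other two steps named in the abstract, namely deciding whether a policy conflict calls for a separate variant of a component, and rewriting the affected modules.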


Notes

  1. http://www.cs.uoi.gr/~pvassil/projects/hecataeus/.

  2. Well-known constraints of database relations—i.e., primary/foreign key, unique, not null, and check constraints—can also be captured by this modeling technique. Foreign keys are subset relations of the source and the target attribute, and check constraints are simple value-based conditions. Primary keys, which are unique-value constraints, are explicitly represented through a dedicated node tagged by their names and a single operand node.

  3. http://www.cs.uoi.gr/~pmanousi/publications/2013_ER/index.html.

  4. http://dbs.uni-leipzig.de/en/publications.

Acknowledgments

We would like to thank the reviewers of this paper for their constructive comments. This research has been co-financed by the European Union (European Social Fund—ESF) and Greek national funds through the Operational Program “Education and Lifelong Learning” of the National Strategic Reference Framework (NSRF)—Research Funding Program: Thales. Investing in knowledge society through the European Social Fund.

Correspondence to Petros Manousis.

About this article

Cite this article

Manousis, P., Vassiliadis, P. & Papastefanatos, G. Impact Analysis and Policy-Conforming Rewriting of Evolving Data-Intensive Ecosystems. J Data Semant 4, 231–267 (2015). https://doi.org/10.1007/s13740-015-0050-3

  • DOI: https://doi.org/10.1007/s13740-015-0050-3
