Abstract
Data-intensive ecosystems are conglomerations of data repositories surrounded by applications that depend on them for their operation. In this paper, we address the problem of performing what-if analysis for the evolution of the database part of a data-intensive ecosystem: we identify which other parts of the ecosystem are affected by a potential change in the database schema and what the ecosystem will look like once the change has been performed, while retaining the ability to regulate the flow of events. We model the ecosystem as a graph, uniformly covering relations, views, and queries as nodes, and their internal structure and interdependencies as the edges of the graph. We provide a simple language to annotate the modules of the graph with policies for their response to evolutionary events. These policies regulate the flow of events and their impact by (i) vetoing (“blocking”) the change in parts that the developers want to retain unaffected and (ii) allowing (“propagating”) the change in parts that need to adapt to the new schema. Our method for the automatic adaptation of ecosystems is based on three algorithms that automatically (i) assess the impact of a change, (ii) determine the need for different variants of an ecosystem’s components, depending on policy conflicts, and (iii) rewrite the modules to adapt to the change. We theoretically prove the coverage of the language, as well as the termination, consistency, and confluence of our algorithms, and experimentally verify our method’s effectiveness and efficiency.
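To make the notions of blocking and propagating concrete, the following is a minimal sketch of policy-regulated impact assessment over a dependency graph. It is an illustration under simplifying assumptions, not the algorithms of the paper: the module names (`R`, `V1`, `Q1`, ...), the `dependents` and `policy` dictionaries, and the function `assess_impact` are all hypothetical, a policy is attached to a whole module rather than to individual events, and the sketch does not attempt variant computation or rewriting.

```python
from collections import deque

# Hypothetical toy ecosystem: each key is a provider (relation or view),
# each value lists the modules (views, queries) that depend on it.
dependents = {
    "R": ["V1", "Q1"],
    "V1": ["Q2", "Q3"],
    "Q1": [],
    "Q2": [],
    "Q3": [],
}

# Hypothetical per-module policies; modules not listed use the default.
policy = {"V1": "block"}


def assess_impact(start, dependents, policy, default="propagate"):
    """Breadth-first walk from the changed module `start`.

    Returns (affected, blockers): modules that accept the change, in
    visiting order, and modules that veto it. A blocker stops the event,
    so nothing reachable only through it is visited.
    """
    affected, blockers = [], []
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        affected.append(node)
        for consumer in dependents.get(node, []):
            if consumer in seen:
                continue
            seen.add(consumer)
            if policy.get(consumer, default) == "block":
                blockers.append(consumer)  # veto: the event stops here
            else:
                queue.append(consumer)  # propagate: keep walking
    return affected, blockers


affected, blockers = assess_impact("R", dependents, policy)
print(affected, blockers)
```

With the policy above, a change on `R` reaches `Q1` but is vetoed at `V1`, so `Q2` and `Q3` are never notified; with an empty policy dictionary, every module is reported as affected.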

Notes
Well-known constraints of database relations—i.e., primary/foreign key, unique, not null, and check constraints—can also be captured by this modeling technique. Foreign keys are subset relations of the source and the target attribute, and check constraints are simple value-based conditions. Primary keys, which are unique-value constraints, are explicitly represented through a dedicated node tagged by their names and a single operand node.
Acknowledgments
We would like to thank the reviewers of this paper for their constructive comments. This research has been co-financed by the European Union (European Social Fund—ESF) and Greek national funds through the Operational Program “Education and Lifelong Learning” of the National Strategic Reference Framework (NSRF)—Research Funding Program: Thales. Investing in knowledge society through the European Social Fund.
Cite this article
Manousis, P., Vassiliadis, P. & Papastefanatos, G. Impact Analysis and Policy-Conforming Rewriting of Evolving Data-Intensive Ecosystems. J Data Semant 4, 231–267 (2015). https://doi.org/10.1007/s13740-015-0050-3
DOI: https://doi.org/10.1007/s13740-015-0050-3