ABSTRACT
A schema-mapping is a high-level specification of a data-exchange setting in which a set of source-to-target dependencies realizes basic operations from source to target relations (such as copy, selection, join or union), while the target schema is subject to a set of target constraints (such as inclusion dependencies or key constraints). In this paper, we consider strong schema-mappings, which allow additional constraints: source dependencies on the source schema, and target-to-source dependencies from the target relations back to the source. Furthermore, strong schema-mappings may include disjunctive dependencies. We argue that this extension is desirable when the source instance is to provide both a lower and an upper bound on the information that a target instance can have.
We first focus on the implication problem for strong schema-mappings, which is to determine whether a given constraint δ is logically implied by a set Σ of constraints (denoted by Σ ⊨ δ). After providing complete characterizations of this problem in terms of universal solutions (while supporting equality constraints), we introduce termination criteria, denoted by TOC, DTOC and MTOC, that allow the efficient computation of universal solutions for standard constraints, for disjunctive constraints, and for the case where the source instance is assumed to be immutable (i.e., it is master data), respectively. We obtain decision procedures for the implication problem, provided that Σ satisfies these termination conditions, and give the corresponding complexity bounds. As an immediate application, we revisit the problems of determinacy, relative information completeness and variations thereof, all for strong schema-mappings. Indeed, by viewing them as implication problems, we obtain efficient decision procedures when the relevant termination conditions are satisfied.
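The idea of deciding Σ ⊨ δ through a chase can be illustrated with a minimal sketch. It is not the procedure of the paper: it assumes that every constraint is a full dependency (no existential variables, no disjunction, no equalities), so the chase trivially terminates and none of TOC, DTOC or MTOC is needed. All names (`matches`, `chase`, `implies`, and the relations `Emp`, `Works`, `Dept`) are hypothetical and chosen only for illustration.

```python
# Assumption: constraints are full tgds, written as (body_atoms, head_atoms).
# An atom is (relation_name, terms); variables are strings starting with "?".

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def matches(atoms, facts, binding=None):
    """Yield every variable binding that maps all atoms into the fact set."""
    binding = binding or {}
    if not atoms:
        yield dict(binding)
        return
    (rel, terms), rest = atoms[0], atoms[1:]
    for fact_rel, fact_terms in facts:
        if fact_rel != rel or len(fact_terms) != len(terms):
            continue
        extended = dict(binding)
        consistent = True
        for term, value in zip(terms, fact_terms):
            if is_var(term):
                if extended.setdefault(term, value) != value:
                    consistent = False
                    break
            elif term != value:
                consistent = False
                break
        if consistent:
            yield from matches(rest, facts, extended)

def substitute(atom, binding):
    rel, terms = atom
    return (rel, tuple(binding.get(t, t) for t in terms))

def chase(facts, rules):
    """Apply the rules until no new fact can be derived (a fixpoint)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            # Collect all bindings first so the fact set is not mutated
            # while it is being iterated.
            for binding in list(matches(body, facts)):
                for atom in head:
                    fact = substitute(atom, binding)
                    if fact not in facts:
                        facts.add(fact)
                        changed = True
    return facts

def implies(rules, delta):
    """Check rules |= delta by chasing the frozen body of delta."""
    body, head = delta
    # Freeze each variable of the body into a fresh constant.
    frozen = {t: "c_" + t[1:] for _, terms in body for t in terms if is_var(t)}
    canonical = {substitute(atom, frozen) for atom in body}
    result = chase(canonical, rules)
    return all(substitute(atom, frozen) in result for atom in head)

# Hypothetical example: a copy rule followed by a projection rule.
sigma = [
    ([("Emp", ("?x", "?d"))], [("Works", ("?x", "?d"))]),
    ([("Works", ("?x", "?d"))], [("Dept", ("?d",))]),
]
delta_yes = ([("Emp", ("?x", "?d"))], [("Dept", ("?d",))])
delta_no = ([("Works", ("?x", "?d"))], [("Emp", ("?x", "?d"))])
```

Under these assumptions, `implies(sigma, delta_yes)` holds because chasing the frozen fact `Emp(c_x, c_d)` derives `Dept(c_d)`, whereas `implies(sigma, delta_no)` fails because nothing in `sigma` derives an `Emp` fact. Dependencies with existential variables or disjunctions require fresh nulls and branching, which is where termination conditions become essential.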
We then focus on the problem of deciding whether the target-to-source constraints in a strong schema-mapping are already implied by the embedded (standard) schema-mapping. This problem is important if one wants to use target-to-source constraints in standard data-exchange tools. Since no such constraints are logically implied by standard schema-mappings (and hence the results established earlier are of no use), we provide an alternative semantics for implication. More specifically, we require the constraint to be satisfied by every solution that a standard data-exchange tool may output. We consider three semantics, based on universal solutions, cores and CWA-solutions, respectively. Decidability of the implication of general (resp. safe) target-to-source constraints is shown for the CWA-based semantics (resp. the core-based semantics).
Index Terms
- Static analysis of schema-mappings ensuring oblivious termination