Abstract
In the traditional data exchange setting, source instances are restricted to be complete in the sense that every fact is either true or false in these instances. Although natural for a typical database translation scenario, this restriction is gradually becoming an impediment to the development of a wide range of applications that need to exchange objects that admit several interpretations. In particular, we are motivated by two specific applications that go beyond the usual data exchange scenario: exchanging incomplete information and exchanging knowledge bases.
In this article, we propose a general framework for data exchange that can deal with these two applications. More specifically, we address the problem of exchanging information given by representation systems, which are essentially finite descriptions of (possibly infinite) sets of complete instances. We make use of the classical semantics of mappings specified by sets of logical sentences to give a meaningful semantics to the notion of exchanging representatives, from which the standard notions of solution, space of solutions, and universal solution naturally arise. We also introduce the notion of a strong representation system for a class of mappings, which resembles the concept of a strong representation system for a query language. We show the robustness of our proposal by applying it to the two applications mentioned above, exchanging incomplete information and exchanging knowledge bases, both of which are instantiations of the exchange problem for representation systems. We study these two applications in detail, presenting results on expressiveness, query answering, and the complexity of computing solutions, as well as algorithms to materialize solutions.
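As a rough illustration of what a representation system is, the sketch below treats a table with nulls as a finite description of a set of complete instances. It is not taken from the article: the encoding of nulls as strings starting with `_`, the finite `domain`, the helper names `rep`, `is_solution`, and `copy_mapping`, and the reading of "solution" as "every instance represented by the target is a solution for some instance represented by the source" are all assumptions made for this example.

```python
from itertools import product


def is_null(value):
    # Assumption of this sketch: nulls are strings that start with "_".
    return isinstance(value, str) and value.startswith("_")


def rep(table, domain):
    """Return the complete instances described by a table with nulls.

    Each way of replacing the nulls by values from `domain` gives one
    complete instance. A finite domain is used only to keep the
    enumeration finite; it is a simplification made for this example.
    """
    nulls = sorted({v for row in table for v in row if is_null(v)})
    worlds = set()
    for values in product(domain, repeat=len(nulls)):
        valuation = dict(zip(nulls, values))
        worlds.add(frozenset(
            tuple(valuation.get(v, v) for v in row) for row in table
        ))
    return worlds


def is_solution(target_table, source_table, domain, satisfies):
    """Check the assumed lifting of 'solution' to representatives.

    `satisfies(source, target)` is a hypothetical predicate saying that
    the pair of complete instances satisfies the mapping.
    """
    sources = rep(source_table, domain)
    return all(
        any(satisfies(i, j) for i in sources)
        for j in rep(target_table, domain)
    )


def copy_mapping(source, target):
    # Hypothetical mapping: every source fact must appear in the target.
    return source <= target


source = {("a", "_x")}
domain = ["b", "c"]
print(len(rep(source, domain)))
print(is_solution({("a", "_y")}, source, domain, copy_mapping))
```

Here the single source table stands for two complete instances, one per value of the null, and the target table with a fresh null passes the check because each instance it represents contains some instance represented by the source.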