DOI: 10.1145/1065167.1065171
Article

XML data exchange: consistency and query answering

Published: 13 June 2005

Abstract

Data exchange is the problem of finding an instance of a target schema, given an instance of a source schema and a specification of the relationship between the source and the target. Theoretical foundations of data exchange have recently been investigated for relational data. In this paper, we start looking into the basic properties of XML data exchange, that is, restructuring of XML documents that conform to a source DTD under a target DTD, and answering queries written over the target schema. We define XML data exchange settings in which source-to-target dependencies refer to the hierarchical structure of the data. Combining DTDs and dependencies makes some XML data exchange settings inconsistent. We investigate the consistency problem and determine its exact complexity. We then move to query answering, and prove a dichotomy theorem that classifies data exchange settings into those over which query answering is tractable, and those over which it is coNP-complete, depending on classes of regular expressions used in DTDs. Furthermore, for all tractable cases we give polynomial-time algorithms that compute target XML documents over which queries can be answered.
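As a small illustration of what it means for a document to "conform to a DTD" whose content models are regular expressions, the sketch below checks a tree against a toy DTD. It is not taken from the paper: the element names, the `TOY_DTD` dictionary, and the `conforms` helper are all hypothetical. It also simplifies DTDs by ignoring attributes and the choice of root element.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical toy DTD (an assumption for illustration only): each element
# type maps to a regular expression over the labels of its children. Every
# label is followed by one space so that label boundaries are unambiguous.
TOY_DTD = {
    "db": r"(book )*",
    "book": r"title (author )+",
    "title": r"",
    "author": r"",
}

def conforms(node, dtd):
    """Return True if the subtree rooted at node satisfies the toy DTD."""
    if node.tag not in dtd:
        return False
    # Spell out the sequence of child labels as a single string.
    child_word = "".join(child.tag + " " for child in node)
    # The whole child sequence must match the element's content model.
    if re.fullmatch(dtd[node.tag], child_word) is None:
        return False
    return all(conforms(child, dtd) for child in node)

good = ET.fromstring("<db><book><title/><author/><author/></book></db>")
bad = ET.fromstring("<db><book><author/></book></db>")  # book lacks a title
print(conforms(good, TOY_DTD), conforms(bad, TOY_DTD))
```

The second document fails because the `book` content model requires a `title` before the authors. The consistency and query-answering questions studied in the paper concern target documents that must satisfy constraints of this kind together with the source-to-target dependencies.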


Published In

PODS '05: Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
June 2005
388 pages
ISBN:1595930620
DOI:10.1145/1065167
  • General Chair: Georg Gottlob
  • Program Chair: Foto Afrati

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SIGMOD/PODS05

Acceptance Rates

Overall Acceptance Rate 642 of 2,707 submissions, 24%
