DOI: 10.1145/1807085.1807117
research-article

Schema design for XML repositories: complexity and tractability

Published: 06 June 2010

Abstract

Abiteboul et al. initiated the systematic study of distributed XML documents consisting of several logical parts, possibly located on different machines. The physical distribution of such documents immediately raises the following question: how can a global schema for the distributed document be broken up into local schemas for the different logical parts? The desired set of local schemas should guarantee that, if each logical part satisfies its local schema, then the distributed document satisfies the global schema.
Abiteboul et al. proposed three levels of desirability for local schemas: local typing, maximal local typing, and perfect local typing. Immediate algorithmic questions are: (i) given a typing, determine whether it is local, maximal local, or perfect, and (ii) given a document and a schema, establish whether a (maximal) local or perfect typing exists. This paper improves the open complexity results in their work and initiates the study of (i) and (ii) for schema restrictions arising from the current standards: DTDs and XML Schemas with deterministic content models. The most striking result is that these restrictions yield tractable complexities for the perfect typing problem.
Furthermore, an open problem in Formal Language Theory is settled: deciding language primality for deterministic finite automata is PSPACE-complete.
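The abstract does not define language primality. The sketch below assumes the usual definition from the formal-language literature: a language L is prime if it cannot be written as a concatenation L1·L2 in which neither factor is the singleton {ε}. It handles only small, non-empty finite languages by brute force and is unrelated to the automata-based setting of the PSPACE-completeness result. The function name `find_decomposition` is hypothetical.

```python
from itertools import chain, combinations


def find_decomposition(language):
    """Return (L1, L2) with L1.L2 == language and L1, L2 != {""}, or None.

    Brute force over candidate left factors; exponential in the number of
    prefixes, so only suitable for very small finite languages.
    """
    lang = frozenset(language)
    if not lang:
        raise ValueError("expected a non-empty finite language")
    prefixes = sorted({w[:i] for w in lang for i in range(len(w) + 1)})
    suffixes = sorted({w[i:] for w in lang for i in range(len(w) + 1)})
    candidates = chain.from_iterable(
        combinations(prefixes, r) for r in range(1, len(prefixes) + 1)
    )
    for left in candidates:
        l1 = frozenset(left)
        if l1 == {""}:
            continue  # trivial factor, not allowed
        # Largest right factor that keeps the product inside the language.
        l2 = frozenset(y for y in suffixes if all(x + y in lang for x in l1))
        if not l2 or l2 == {""}:
            continue
        if {x + y for x in l1 for y in l2} == lang:
            return l1, l2
    return None  # no non-trivial factorisation: the language is prime


composite = find_decomposition({"", "a", "aa"})  # factors as {"", "a"} twice
prime = find_decomposition({"a", "b"})           # no factorisation exists
```

For a fixed left factor it suffices to test the largest compatible right factor: if any right factor works, the largest one yields the same product.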

References

[1]
S. Abiteboul, G. Gottlob, and M. Manna. Distributed XML design. In ACM PODS, pages 247--258, 2009.
[2]
S.V. Avgustinovich and A. Frid. A unique decomposition theorem for factorial languages. Int. J. of Algebra and Comput., 15:149--160, 2005.
[3]
Sebastian Bala. Regular language matching and other decidable cases of the satisfiability problem for constraints between regular open terms. In STACS, pages 596--607, 2004.
[4]
G. J. Bex, W. Gelade, W. Martens, and F. Neven. Simplifying XML Schema: effortless handling of nondeterministic regular expressions. In ACM SIGMOD, pages 731--744, 2009.
[5]
T. Bray, J. Paoli, C. M. Sperberg-McQueen, E. Maler, and F. Yergeau. Extensible Markup Language XML 1.0 (fifth edition). Technical report, World Wide Web Consortium (W3C), November 2008. W3C Recommendation, http://www.w3.org/TR/2008/REC-xml-20081126/.
[6]
A. Brüggemann-Klein and D. Wood. One-unambiguous regular languages. Inf. and Comput., 142(2):182--206, 1998.
[7]
D. Calvanese, G. De Giacomo, M. Lenzerini, and M.Y. Vardi. Rewriting of regular expressions and regular path queries. J. Comp. Syst. Sc., 64(3):443--465, 2002.
[8]
J. Clark and M. Murata. Relax NG specification. http://www.relaxng.org/spec-20011203.html, December 2001.
[9]
J.H. Conway. Regular Algebra and Finite Machines. Chapman and Hall, 1971.
[10]
J. Czyzowicz, W. Fraczak, A. Pelc, and W. Rytter. Linear-time prime decompositions of regular prefix codes. Int. J. Found. Comp. Sc., 14:1019--1031, 2003.
[11]
D. Fallside and P. Walmsley. XML Schema Part 0: Primer (second edition). Technical report, World Wide Web Consortium, October 2004. http://www.w3.org/TR/2004/REC-xmlschema-0-20041028/.
[12]
S. Gao, C. M. Sperberg-McQueen, H.S. Thompson, N. Mendelsohn, D. Beech, and M. Maloney. W3C XML Schema Definition Language (XSD) 1.1 part 1: Structures. Technical report, World Wide Web Consortium, April 2009. W3C Recommendation, http://www.w3.org/TR/2009/CR-xmlschema11-1-20090430/.
[13]
Y.-S. Han, K. Salomaa, and D. Wood. Prime decompositions of regular languages. In DLT, pages 145--155, 2006.
[14]
T. Jiang and B. Ravikumar. Minimal NFA problems are hard. Siam J. Comp., 22(6):1117--1141, 1993.
[15]
M. Kunc. What do we know about language equations? In DLT, pages 23--27, 2007.
[16]
W. Martens, F. Neven, and T. Schwentick. Complexity of decision problems for XML schemas and chain regular expressions. Siam J. Comp., 39(4):1486--1530, 2009.
[17]
Y. Papakonstantinou and V. Vianu. DTD inference for views of XML data. In ACM PODS, pages 35--46, 2000.
[18]
A. Salomaa, K. Salomaa, and S. Yu. Length codes, products of languages and primality. In LATA, pages 476--486, 2008.
[19]
A. Salomaa and S. Yu. On the decomposition of finite languages. In DLT, pages 22--31, 1999.
[20]
K. Salomaa. Language decompositions, primality, and trajectory-based operations. In CIAA, pages 17--22, 2008.
[21]
P. van Emde Boas. The convenience of tilings. In A. Sorbi, editor, Complexity, Logic and Recursion Theory, volume 187 of Lecture Notes in Pure and Applied Mathematics, pages 331--363. Marcel Dekker Inc., 1997.
[22]
M. Y. Vardi. An automata-theoretic approach to linear temporal logic. In BANFF, pages 238--266, 1995.
[23]
W. Wieczorek. An algorithm for the decomposition of finite languages. Logic J. of the IGPL, 2009. Appeared on-line August 8, 2009.
[24]
S. Yu. Regular languages. In G. Rozenberg and A. Salomaa, editors, Handbook of Formal Languages, volume 1, chapter 2. Springer, 1997.

Published In

PODS '10: Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
June 2010
350 pages
ISBN:9781450300339
DOI:10.1145/1807085

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. complexity
  2. language primality
  3. xml
  4. xml schemas

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '10: International Conference on Management of Data
June 6 - 11, 2010
Indianapolis, Indiana, USA

Acceptance Rates

PODS '10 Paper Acceptance Rate 27 of 113 submissions, 24%;
Overall Acceptance Rate 642 of 2,707 submissions, 24%

Cited By

  • (2022) An adaptive parallel algorithm for finite language decomposition. Applied Intelligence 52:3 (3029-3050). DOI: 10.1007/s10489-021-02488-y. Online publication date: 1-Feb-2022
  • (2019) Split-Correctness in Information Extraction. Proceedings of the 38th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (149-163). DOI: 10.1145/3294052.3319684. Online publication date: 25-Jun-2019
  • (2018) Propagating Regular Membership with Dashed Strings. Principles and Practice of Constraint Programming (13-29). DOI: 10.1007/978-3-319-98334-9_2. Online publication date: 23-Aug-2018
  • (2013) On simplification of schema mappings. Journal of Computer and System Sciences 79:6 (816-834). DOI: 10.1016/j.jcss.2013.01.005. Online publication date: 1-Sep-2013
  • (2012) Foundations of regular expressions in XML schema languages and SPARQL. Proceedings of the SIGMOD/PODS 2012 PhD Symposium (39-44). DOI: 10.1145/2213598.2213609. Online publication date: 20-May-2012
  • (2012) Descriptional complexity of deterministic regular expressions. Proceedings of the 37th international conference on Mathematical Foundations of Computer Science (643-654). DOI: 10.1007/978-3-642-32589-2_56. Online publication date: 27-Aug-2012
  • (2011) On language decompositions and primality. Rainbow of computer science (63-75). DOI: 10.5555/2001113.2001120. Online publication date: 1-Jan-2011
  • (2011) Simplifying schema mappings. Proceedings of the 14th International Conference on Database Theory (114-125). DOI: 10.1145/1938551.1938568. Online publication date: 21-Mar-2011
  • (2011) On Language Decompositions and Primality. Rainbow of Computer Science (63-75). DOI: 10.1007/978-3-642-19391-0_5. Online publication date: 2011
