Abstract
This paper studies transformations of XML schemas that are widely used in XML data management algorithms. Because the properties and functional characteristics of XML documents differ considerably from those of other kinds of data, a number of typical data management problems (such as XML data validation, schema inference, and data translation to and from other models) are harder to solve for XML. The general idea of our approach is to transform the original structure (i.e., the structural schema constraints) into another structure without losing information about those properties of the original data that matter to applications. This technique has already been used successfully in various algorithms for problems of this kind. Here we treat it systematically: we present methods for reducing XML schemas to several canonical forms and examine algorithms that solve the data management problems for data satisfying schemas in these canonical forms.
Cite this article
Novak, L.G., Kuznetsov, S.D. Canonical Forms of XML Schemas. Programming and Computer Software 29, 283–293 (2003). https://doi.org/10.1023/A:1025741309518
DOI: https://doi.org/10.1023/A:1025741309518