Abstract
XML is becoming a standard for document processing and interchange on the Internet. In this context, XML’s DTD structuring mechanism is especially important. This paper addresses, among other things, the problem of integrating several DTDs by introducing a grammar-based model for XML built on so-called xtrees and xschemes. Although a DTD describing the intersection of the languages defined by two arbitrary DTDs can always be found, the same is not true for the union. This result shows that pure DTDs are not sufficient for schema integration. The paper presents a solution to this problem by introducing and investigating a generalized model based on so-called s-xschemes. These s-xschemes also yield additional information about XML documents belonging to certain classes, enabling more specific computation and transformation of information.
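The union claim can be illustrated with a small sketch. This is not the paper's xtree/xscheme formalism: it assumes a toy model in which a DTD is a root name plus, for each element name, a finite set of allowed child-name sequences (real content models are regular expressions). The names `valid`, `merge`, `D1`, and `D2` are hypothetical. In this toy model, any DTD that accepts both example documents must allow `a` to contain `b` and also `c`, so it accepts a mixed document that belongs to neither original language.

```python
# Toy model (an assumption, not the paper's definition):
# a tree is (label, children); a DTD is (root, rules), where rules maps an
# element name to the set of allowed sequences of child element names.

def valid(tree, dtd):
    """Return True if the tree conforms to the toy DTD."""
    root, rules = dtd
    return tree[0] == root and _ok(tree, rules)

def _ok(node, rules):
    label, children = node
    seq = tuple(child[0] for child in children)
    return seq in rules.get(label, set()) and all(_ok(c, rules) for c in children)

def merge(d1, d2):
    """Rule-wise union of two toy DTDs that share the same root name."""
    names = set(d1[1]) | set(d2[1])
    rules = {n: d1[1].get(n, set()) | d2[1].get(n, set()) for n in names}
    return (d1[0], rules)

def leaf(name):
    return (name, [])

# D1: r contains two a's, each a contains one b.
D1 = ("r", {"r": {("a", "a")}, "a": {("b",)}, "b": {()}})
# D2: r contains two a's, each a contains one c.
D2 = ("r", {"r": {("a", "a")}, "a": {("c",)}, "c": {()}})

t_bb = ("r", [("a", [leaf("b")]), ("a", [leaf("b")])])  # in L(D1)
t_cc = ("r", [("a", [leaf("c")]), ("a", [leaf("c")])])  # in L(D2)
t_bc = ("r", [("a", [leaf("b")]), ("a", [leaf("c")])])  # in neither

merged = merge(D1, D2)
in_union = valid(t_bc, D1) or valid(t_bc, D2)  # membership in the true union
over_accepted = valid(t_bc, merged)            # membership in the merged DTD
```

Because an element name has exactly one rule in a DTD, the merged DTD cannot remember which kind of `a` it has already seen. A model that allows several types to share one element name avoids this loss, which is the direction the abstract attributes to s-xschemes.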
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Behrens, R. (2000). A Grammar Based Model for XML Schema Integration. In: Lings, B., Jeffery, K. (eds) Advances in Databases. BNCOD 2000. Lecture Notes in Computer Science, vol 1832. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45033-5_13
DOI: https://doi.org/10.1007/3-540-45033-5_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67743-7
Online ISBN: 978-3-540-45033-7