Abstract
Semistructured data is becoming increasingly important for web applications with the development of XML and related technologies. Designing a “good” semistructured database is crucial to prevent data redundancy, inconsistency and undesirable updating anomalies. However, unlike relational databases, there is no normalization theory to facilitate the design of good semistructured databases. In this paper, we introduce the notion of a semistructured schema and identify the various anomalies that may occur in such a schema. A Normal Form for Semistructured Schemata, NF-SS, is proposed. A semistructured schema in NF-SS guarantees minimal redundancy and hence no undesirable updating anomalies for the associated semistructured databases. Furthermore, a semistructured schema in NF-SS gives a more reasonable representation of real world semantics. We develop an iterative algorithm based on a set of heuristic rules to restructure a semistructured schema into a normal form. These design methods also provide insights into the normalization task for semistructured databases.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wu, X., Ling, T.W., Sin Yeung, L., Lee, M.L., Dobbie, G. (2002). NF-SS: A Normal Form for Semistructured Schema. In: Arisawa, H., Kambayashi, Y., Kumar, V., Mayr, H.C., Hunt, I. (eds) Conceptual Modeling for New Information Systems Technologies. ER 2001. Lecture Notes in Computer Science, vol 2465. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46140-X_23
DOI: https://doi.org/10.1007/3-540-46140-X_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44122-9
Online ISBN: 978-3-540-46140-1
eBook Packages: Springer Book Archive