Abstract
The main goal of normalization techniques is a database design that avoids redundant information and update anomalies. Ideally, normalization should preserve both the data and the constraints. However, this is not always achievable: BCNF eliminates all redundancies but may not preserve constraints, whereas 3NF achieves dependency preservation but may not eliminate all redundancies.
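The trade-off between BCNF and 3NF can be illustrated with a minimal sketch. It assumes the standard textbook schema R(A, B, C) with dependencies AB → C and C → B, which is not taken from the paper itself; the helper names (`closure`, `candidate_keys`, `is_bcnf`, `is_3nf`, `preserved`) are hypothetical, and the normal-form checks inspect only the listed dependencies.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(schema, fds):
    """Minimal attribute sets whose closure is the whole schema (brute force)."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            if any(key <= attrs for key in keys):
                continue  # a smaller key is contained in attrs, so attrs is not minimal
            if closure(attrs, fds) == set(schema):
                keys.append(frozenset(attrs))
    return keys

def is_bcnf(schema, fds):
    """Every listed nontrivial dependency must have a superkey on the left."""
    return all(rhs <= lhs or closure(lhs, fds) == set(schema) for lhs, rhs in fds)

def is_3nf(schema, fds):
    """Like BCNF, but a violation is tolerated if it only determines prime attributes."""
    prime = set().union(*candidate_keys(schema, fds))
    return all(
        closure(lhs, fds) == set(schema) or (rhs - lhs) <= prime
        for lhs, rhs in fds
    )

def preserved(fd, fragments, fds):
    """Check whether fd follows from the dependencies projected onto the fragments."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for frag in fragments:
            gained = closure(result & frag, fds) & frag
            if not gained <= result:
                result |= gained
                changed = True
    return rhs <= result

# Assumed example schema, not from the paper: R(A, B, C) with AB -> C and C -> B.
schema = frozenset("ABC")
ab_to_c = (frozenset("AB"), frozenset("C"))
c_to_b = (frozenset("C"), frozenset("B"))
fds = [ab_to_c, c_to_b]

# Splitting on the violating dependency C -> B gives the fragments (B, C) and (A, C).
fragments = [frozenset("BC"), frozenset("AC")]

in_3nf = is_3nf(schema, fds)                            # C -> B only determines a prime attribute
in_bcnf = is_bcnf(schema, fds)                          # C is not a superkey
keeps_ab_to_c = preserved(ab_to_c, fragments, fds)      # no fragment contains A, B and C
keeps_c_to_b = preserved(c_to_b, fragments, fds)
```

Under these assumptions the schema is in 3NF but not in BCNF, and the decomposition that removes the violation keeps C → B while losing AB → C.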
Our first goal is to investigate how much redundancy 3NF tolerates in order to achieve dependency preservation. We apply an information-theoretic measure and show that only prime attributes admit redundant information in 3NF, but their information content may be arbitrarily low.
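As a rough illustration of redundancy in a prime attribute, the sketch below assumes the textbook schema R(A, B, C) with AB → C and C → B, in which B is prime because AB is a key. The rows are invented and the function name `inferable_positions` is hypothetical. Counting values that can be inferred from other tuples is only a crude stand-in and is not the information-theoretic measure used in the paper.

```python
def inferable_positions(rows, lhs, rhs):
    """Indices of rows whose rhs value can be inferred from another row agreeing on lhs.

    Raises ValueError if the rows violate the dependency lhs -> rhs.
    """
    groups = {}
    for index, row in enumerate(rows):
        groups.setdefault(row[lhs], []).append(index)
    inferable = []
    for indices in groups.values():
        values = {rows[i][rhs] for i in indices}
        if len(values) > 1:
            raise ValueError(f"rows violate {lhs} -> {rhs}")
        if len(indices) > 1:
            inferable.extend(indices)
    return sorted(inferable)

# Invented instance of R(A, B, C); it satisfies AB -> C and C -> B.
rows = [
    {"A": "a1", "B": "b1", "C": "c1"},
    {"A": "a2", "B": "b1", "C": "c1"},
    {"A": "a3", "B": "b2", "C": "c2"},
]

# The B-values of the first two rows are each determined by the other row via C -> B.
redundant_b = inferable_positions(rows, "C", "B")
```

Here the B-values in the first two rows carry no new information once the other row is known, even though the schema is in 3NF.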
Then we study the possibility of achieving both redundancy elimination and dependency preservation through a hierarchical representation of relational data in XML. We characterize the cases in which an XML normal form called XNF guarantees both.
Finally, we deal with dependency preservation in XML and show that, as in the relational case, normalizing XML documents to achieve non-redundant data can result in losing constraints. By modifying the definition of XNF, we define another normal form for XML documents, X3NF, that generalizes 3NF to XML and achieves dependency preservation.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kolahi, S. (2005). Dependency-Preserving Normalization of Relational and XML Data. In: Bierman, G., Koch, C. (eds) Database Programming Languages. DBPL 2005. Lecture Notes in Computer Science, vol 3774. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11601524_16
DOI: https://doi.org/10.1007/11601524_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30951-2
Online ISBN: 978-3-540-31445-5
eBook Packages: Computer Science, Computer Science (R0)