Abstract
With the growing use of the eXtensible Markup Language (XML) in database technology as a format for the permanent storage of data, the topic of functional dependencies in XML (XFDs) has assumed increased importance because of its central role in database design. Recently, two different approaches have been proposed for defining an XFD. The first uses the concept of a ‘tree tuple’, whereas the second uses the concept of a ‘closest node’. In general, the two approaches are not comparable, but they become comparable when a Document Type Definition (DTD) is present and there is no missing information in the XML document. The first contribution of this article is to show that when the two XFD definitions are comparable, they are equivalent, and so there is essentially a common definition of an XFD in complete XML documents. The second contribution is to provide justification for the definition of a ‘closest node’ XFD. We show that if a complete flat relation is mapped to an XML document by an arbitrary sequence of nest operations, the XML document satisfies a ‘closest node’ XFD if and only if the relation satisfies the corresponding functional dependency (FD). The class of XML documents generated in this fashion is a subset of the class of XML documents for which the two definitions of XFDs coincide. Hence ‘tree tuple’ and ‘closest node’ XFDs both capture the semantics of FDs when a complete relation is mapped to an XML document via arbitrary nesting.
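As a rough illustration of the relational side of this result, the following sketch checks an FD on a complete flat relation and applies a single nest operation. It does not implement the ‘tree tuple’ or ‘closest node’ definitions, which the abstract does not spell out. The function names (`satisfies_fd`, `nest`, `unnest`), the dictionary representation of tuples, and the sample course data are all assumptions made for this example.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if the flat relation `rows` satisfies the FD lhs -> rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Two rows that agree on lhs must also agree on rhs.
        if seen.setdefault(key, val) != val:
            return False
    return True


def nest(rows, attr):
    """Nest on `attr`: group rows that agree on all other attributes
    and collect their `attr` values into one frozenset per group."""
    groups = {}
    for row in rows:
        rest = tuple(sorted((k, v) for k, v in row.items() if k != attr))
        groups.setdefault(rest, set()).add(row[attr])
    return [dict(rest, **{attr: frozenset(vals)}) for rest, vals in groups.items()]


def unnest(nested, attr):
    """Inverse of nest: emit one flat row per member of the nested set."""
    return [dict(row, **{attr: v}) for row in nested for v in row[attr]]


# Hypothetical complete relation (no missing values).
rows = [
    {"course": "db", "teacher": "ann", "student": "s1"},
    {"course": "db", "teacher": "ann", "student": "s2"},
    {"course": "ai", "teacher": "bob", "student": "s1"},
]

nested = nest(rows, "student")      # one group per (course, teacher) pair
flat = unnest(nested, "student")    # recovers the original tuples

fd_holds = satisfies_fd(rows, ["course"], ["teacher"])
fd_fails = satisfies_fd(rows, ["student"], ["course"])
```

In this sample, `course -> teacher` holds while `student -> course` does not, and unnesting the nested relation recovers the original tuples. Each nested group can be read as one subtree of an XML document, which is the kind of mapping the abstract refers to; the precise correspondence between the FD and the XFD is the subject of the article itself.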
Cite this article
Vincent, M.W., Liu, J. & Mohania, M. On the equivalence between FDs in XML and FDs in relations. Acta Informatica 44, 207–247 (2007). https://doi.org/10.1007/s00236-007-0048-x
DOI: https://doi.org/10.1007/s00236-007-0048-x