Abstract
XML elements are described by XML schema languages such as a DTD or an XML Schema definition. The instances of these elements are semi-structured tuples. We may think of a semi-structured tuple as a sentence of a formal language, where the values are the terminal symbols and the attribute names are the nonterminal symbols. In our former work [13] we introduced the notion of the extended tuple as a sentence of a regular language generated by a grammar whose nonterminal symbols are the attribute names of the tuple. Sets of extended tuples are the extended relations. We then introduced the dual language, which generates the tuple types allowed to occur in extended relations, and defined functional dependencies (regular FDs, RFDs) over extended relations. In this paper we rephrase the RFD concept by directly using regular expressions over attribute names to define extended tuples. With the help of a special vertex-labeled graph associated with a regular expression, the specification of substring selection for the projection operation can be defined. Normalization for regular schemas is more complex than in the relational model, because the schema of an extended relation can contain an infinite number of tuple types. However, selection, projection and join operations can be defined on extended relations as well, so a lossless-join decomposition can be performed.
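As an informal illustration of the idea that a regular expression over attribute names defines the admissible tuple types, the following minimal Python sketch may help. It is not the construction used in the paper: the schema `name (phone )*(email )?`, the attribute names, and the helper functions `tuple_type` and `conforms` are all hypothetical, chosen only to show that a single regular schema admits infinitely many tuple types.

```python
import re

# Hypothetical regular schema over attribute names (an assumption for
# illustration): one "name", any number of "phone", an optional "email".
# Each attribute name is followed by a space so that names cannot run together.
SCHEMA = re.compile(r"name (phone )*(email )?")


def tuple_type(extended_tuple):
    """Return the attribute-name sequence of a tuple given as (attribute, value) pairs."""
    return "".join(attr + " " for attr, _value in extended_tuple)


def conforms(extended_tuple):
    """Check whether the tuple's attribute-name sequence is a sentence of the schema."""
    return SCHEMA.fullmatch(tuple_type(extended_tuple)) is not None


# Two tuples of different types that both belong to the same extended relation.
t1 = [("name", "Ann"), ("phone", "555-0100"), ("phone", "555-0101")]
t2 = [("name", "Bob"), ("email", "bob@example.org")]
# A tuple whose attribute order is not generated by the schema.
t3 = [("phone", "555-0100"), ("name", "Ann")]

for t in (t1, t2, t3):
    print(tuple_type(t).strip(), "->", conforms(t))
```

Because `(phone )*` can repeat any number of times, the schema above admits an unbounded family of tuple types, which is the reason the abstract gives for normalization being harder here than in the relational model.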
References
Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison-Wesley (1995)
Arenas, M., Libkin, L.: A normal form for XML documents. ACM Transactions on Database Systems 29(1), 195–232 (2004)
Berry, G., Sethi, R.: From regular expressions to deterministic automata. Theoretical Computer Science 48(3), 117–126 (1986)
Bouyer, P., Petit, A., Thérien, D.: An algebraic approach to data languages and timed languages. Information and Computation 182(2), 137–162 (2003)
Brzozowski, J.A.: Derivatives of regular expressions. Journal of the ACM 11(4), 481–494 (1964)
Champarnaud, J.-M., Ziadi, D.: Canonical derivatives, partial derivatives and finite automaton constructions. Theoretical Computer Science 289(1), 137–163 (2002)
Glushkov, V.M.: The abstract theory of automata. Russian Mathematical Surveys 16, 1–53 (1961)
Kaminski, M., Francez, N.: Finite-memory automata. Theoretical Computer Science 134(2), 329–363 (1994)
Libkin, L., Vrgoč, D.: Regular expressions for data words. In: Bjørner, N., Voronkov, A. (eds.) LPAR-18. LNCS, vol. 7180, pp. 274–288. Springer, Heidelberg (2012)
Murata, M., Lee, D., Mani, M., Kawaguchi, K.: Taxonomy of XML schema languages using formal language theory. ACM Transactions on Internet Technology 5(4), 660–704 (2005)
Nicaud, C., Pivoteau, C., Razet, B.: Average Analysis of Glushkov Automata under a BST-Like Model. In: Proc. FSTTCS, pp. 388–399 (2010)
Sperberg-McQueen, C.M., Thompson, H.: XML Schema. Technical report, World Wide Web Consortium (2005), http://www.w3.org/XML/Schema
Szabó, G.I., Benczúr, A.: Functional Dependencies on Extended Relations Defined by Regular Languages. Annals of Mathematics and Artificial Intelligence (2013), doi:10.1007/s10472-013-9352-z
Vincent, M.W., Liu, J., Liu, C.: Strong functional dependencies and their application to normal forms in XML. ACM Transactions on Database Systems 29(3), 445–462 (2004)
Wang, J., Topor, R.W.: Removing XML Data Redundancies Using Functional and Equality-Generating Dependencies. In: Proc. ADC, pp. 65–74 (2005)
Watson, B.W.: A taxonomy of finite automata construction algorithms. Computing Science Note 93/43, Eindhoven University of Technology, The Netherlands (1994)
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Benczúr, A., Szabó, G.I. (2014). Towards a Normal Form for Extended Relations Defined by Regular Expressions. In: Manolopoulos, Y., Trajcevski, G., Kon-Popovska, M. (eds) Advances in Databases and Information Systems. ADBIS 2014. Lecture Notes in Computer Science, vol 8716. Springer, Cham. https://doi.org/10.1007/978-3-319-10933-6_2
DOI: https://doi.org/10.1007/978-3-319-10933-6_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10932-9
Online ISBN: 978-3-319-10933-6
eBook Packages: Computer Science (R0)