Abstract
Automata for unranked trees form a foundation for XML schemas, querying, and pattern languages. We study the problem of efficiently minimizing such automata. We start with the unranked tree automata (UTAs) that are standard in database theory, assuming bottom-up determinism and that horizontal recursion is represented by deterministic finite automata. We show that minimal UTAs in that class are not unique and that minimization is NP-hard. We then study more recent automata classes that do allow for polynomial-time minimization. Among those, we show that bottom-up deterministic stepwise tree automata yield the most succinct representations.
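To make the notion of a bottom-up deterministic stepwise tree automaton more concrete, the following is a minimal illustrative sketch, not the paper's construction. It assumes the usual reading of stepwise automata: each label has an initial state, and a node's state is updated once per child, from left to right, by a binary transition on (current state, child state). The class name, the state names, and the example language (trees over the single label `a` in which every node has an even number of children) are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

# An unranked tree is a pair (label, list of child trees).
Tree = Tuple[str, List["Tree"]]


class StepwiseAutomaton:
    """Hypothetical sketch of a bottom-up deterministic stepwise tree automaton.

    init  : label -> initial state (partial function)
    delta : (state, child_state) -> state (partial function)
    final : set of accepting states
    """

    def __init__(
        self,
        init: Dict[str, str],
        delta: Dict[Tuple[str, str], str],
        final: Set[str],
    ) -> None:
        self.init = init
        self.delta = delta
        self.final = final

    def run(self, tree: Tree) -> Optional[str]:
        """Return the unique state reached on `tree`, or None if the run is stuck."""
        label, children = tree
        state = self.init.get(label)
        for child in children:
            if state is None:
                return None
            child_state = self.run(child)
            if child_state is None:
                return None
            # One "step" per child: consume the child's state from the current state.
            state = self.delta.get((state, child_state))
        return state

    def accepts(self, tree: Tree) -> bool:
        return self.run(tree) in self.final


# Assumed example language: every node has an even number of children.
# "even"/"odd" track the parity of the children consumed so far; a child that
# itself ends in "odd" has no applicable transition, so the run gets stuck.
EVEN_CHILDREN = StepwiseAutomaton(
    init={"a": "even"},
    delta={("even", "even"): "odd", ("odd", "even"): "even"},
    final={"even"},
)

leaf: Tree = ("a", [])
one_child: Tree = ("a", [leaf])
two_children: Tree = ("a", [leaf, leaf])
bad_subtree: Tree = ("a", [one_child, leaf])
```

In this sketch, determinism comes from `init` and `delta` being (partial) functions, so each tree reaches at most one state. The same two-state machine handles nodes of any arity, which is the kind of sharing between vertical and horizontal processing that the stepwise model is meant to capture.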
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Martens, W., Niehren, J. (2005). Minimizing Tree Automata for Unranked Trees. In: Bierman, G., Koch, C. (eds) Database Programming Languages. DBPL 2005. Lecture Notes in Computer Science, vol 3774. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11601524_15
DOI: https://doi.org/10.1007/11601524_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30951-2
Online ISBN: 978-3-540-31445-5
eBook Packages: Computer Science, Computer Science (R0)