Abstract
A new and simple method of indexing a tree for tree patterns is presented. A tree pattern is a tree whose leaves can be labelled by a special symbol S, which serves as a placeholder for any subtree. Given a subject tree T with n nodes, the tree is preprocessed and an index, which consists of a standard string compact suffix automaton and a subtree jump table, is constructed. The number of distinct tree patterns which match the tree is \(\mathcal{O}(2^n)\), and the size of the index is \(\mathcal{O}(n)\). The searching phase uses the index, reads an input tree pattern P of size m and computes the list of positions of all occurrences of the pattern P in the tree T. For an input tree pattern P in linear prefix notation \(pref(P) = P_1 S P_2 S \ldots S P_k\), \(k \geq 1\), the searching is performed in time \(\mathcal{O}(m + \sum\limits_{i=1}^k |occ(P_i)|)\), where \(occ(P_i)\) is the set of all occurrences of \(P_i\) in \(pref(T)\).
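To make the prefix notation and the subtree jump table concrete, the following is a minimal sketch of a naive matcher. It is not the index described above: it does not build a compact suffix automaton and it tries every position of pref(T) in turn, so it does not achieve the stated search time. The representation of a node as a `(label, arity)` pair, the 0-based positions, and the function names `subtree_jump_table`, `matches_at` and `occurrences` are all assumptions made for this illustration.

```python
from typing import List, Tuple

Symbol = Tuple[str, int]  # (label, arity); an assumed encoding of ranked symbols
PLACEHOLDER = "S"         # the special leaf symbol that stands for any subtree


def subtree_jump_table(pref: List[Symbol]) -> List[int]:
    """For each position i of a well-formed prefix notation, return the
    index just past the end of the subtree rooted at i."""
    jump = [0] * len(pref)
    # Children always lie to the right of their parent, so a right-to-left
    # pass sees every child's entry before it is needed.
    for i in range(len(pref) - 1, -1, -1):
        end = i + 1
        for _ in range(pref[i][1]):
            end = jump[end]  # skip over one whole child subtree
        jump[i] = end
    return jump


def matches_at(pref_t: List[Symbol], jump: List[int],
               pref_p: List[Symbol], start: int) -> bool:
    """Check whether the pattern matches the subtree rooted at `start`."""
    i = start
    for label, arity in pref_p:
        if i >= len(pref_t):
            return False
        if label == PLACEHOLDER:
            i = jump[i]              # S absorbs the whole subtree at i
        elif pref_t[i] != (label, arity):
            return False
        else:
            i += 1
    return True


def occurrences(pref_t: List[Symbol], pref_p: List[Symbol]) -> List[int]:
    """Naively list every 0-based position of pref(T) where P matches."""
    jump = subtree_jump_table(pref_t)
    return [i for i in range(len(pref_t))
            if matches_at(pref_t, jump, pref_p, i)]


# Subject tree with pref(T) = a2 a2 a0 a1 a0 a1 a0
T = [("a", 2), ("a", 2), ("a", 0), ("a", 1), ("a", 0), ("a", 1), ("a", 0)]
# Pattern with pref(P) = a2 S a1 S
P = [("a", 2), (PLACEHOLDER, 0), ("a", 1), (PLACEHOLDER, 0)]

print(subtree_jump_table(T))  # [7, 5, 3, 5, 5, 7, 7]
print(occurrences(T, P))      # [0, 1]
```

In this example the pattern matches at the root and at its first child. Each placeholder is resolved by a single table lookup, which is the role the subtree jump table plays when a match moves from one subpattern \(P_i\) to the next.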
© 2014 Springer International Publishing Switzerland
Janoušek, J., Melichar, B., Polách, R., Poliak, M., Trávníček, J. (2014). A Full and Linear Index of a Tree for Tree Patterns. In: Jürgensen, H., Karhumäki, J., Okhotin, A. (eds) Descriptional Complexity of Formal Systems. DCFS 2014. Lecture Notes in Computer Science, vol 8614. Springer, Cham. https://doi.org/10.1007/978-3-319-09704-6_18
DOI: https://doi.org/10.1007/978-3-319-09704-6_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09703-9
Online ISBN: 978-3-319-09704-6
eBook Packages: Computer Science, Computer Science (R0)