Abstract
Given an ordered labeled forest F (“the target forest”) and an ordered labeled forest G (“the pattern forest”), the most similar subforest problem is to find a subforest F′ of F such that the distance between F′ and G is minimum over all possible F′. This problem generalizes several well-studied problems which have important applications in locating patterns in hierarchical structures such as RNA molecules’ secondary structures and XML documents. In this paper, we present efficient algorithms for the most similar subforest problem with forest edit distance for three types of subforests: simple substructures, sibling substructures, and closed subforests.
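To make the problem statement concrete, the following is a minimal brute-force sketch in Python; it is not the paper's algorithm and makes no claim to efficiency. Several things here are assumptions made for illustration only: forests are encoded as nested tuples of (label, children), every insertion, deletion, and relabeling costs 1, and the candidates F′ are restricted to non-empty contiguous runs of sibling subtrees, each taken with all of its descendants. The abstract does not define the three subforest types, so this candidate family should not be read as any one of them. The names forest_edit_distance, candidate_subforests, and most_similar_subforest are hypothetical.

```python
from functools import lru_cache

# Assumed encoding: a tree is (label, children), where children is a tuple
# of trees; a forest is a tuple of trees. Nested tuples are hashable, so
# whole forests can serve as memoization keys.


def size(forest):
    """Number of nodes in a forest."""
    return sum(1 + size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_edit_distance(f, g):
    """Unit-cost edit distance between two ordered labeled forests.

    Uses the standard recursion on the rightmost roots of f and g.
    """
    if not f:
        return size(g)  # insert every node of g
    if not g:
        return size(f)  # delete every node of f
    (f_label, f_children), (g_label, g_children) = f[-1], g[-1]
    return min(
        # Delete the rightmost root of f; its children take its place.
        forest_edit_distance(f[:-1] + f_children, g) + 1,
        # Insert the rightmost root of g.
        forest_edit_distance(f, g[:-1] + g_children) + 1,
        # Map the two rightmost roots onto each other (relabel if needed).
        forest_edit_distance(f[:-1], g[:-1])
        + forest_edit_distance(f_children, g_children)
        + (f_label != g_label),
    )


def candidate_subforests(forest):
    """Yield every non-empty contiguous run of complete sibling subtrees
    (an assumed candidate family, chosen only for illustration)."""
    for i in range(len(forest)):
        for j in range(i + 1, len(forest) + 1):
            yield forest[i:j]
    for _, children in forest:
        yield from candidate_subforests(children)


def most_similar_subforest(target, pattern):
    """Return (distance, subforest) minimizing the distance to pattern.

    Raises ValueError if target is empty, since there is no candidate.
    """
    best = min(
        candidate_subforests(target),
        key=lambda c: forest_edit_distance(c, pattern),
    )
    return forest_edit_distance(best, pattern), best


# Target forest F: a single tree a(b, c(d), e).
target = (("a", (("b", ()), ("c", (("d", ()),)), ("e", ()))),)
# Pattern forest G: the two trees c(x) and e.
pattern = (("c", (("x", ()),)), ("e", ()))

distance, best = most_similar_subforest(target, pattern)
```

In this small instance the closest candidate is the sibling run c(d), e, which differs from the pattern by a single relabeling of d to x. The sketch evaluates the distance separately for every candidate, which is exactly the redundancy that dedicated algorithms for this problem aim to avoid.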
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jansson, J., Peng, Z. (2006). Algorithms for Finding a Most Similar Subforest. In: Lewenstein, M., Valiente, G. (eds) Combinatorial Pattern Matching. CPM 2006. Lecture Notes in Computer Science, vol 4009. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11780441_34
DOI: https://doi.org/10.1007/11780441_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35455-0
Online ISBN: 978-3-540-35461-1
eBook Packages: Computer Science (R0)