Abstract
The LCS of two rooted, ordered, and labeled trees F and G is the largest forest that can be obtained from both trees by deleting nodes. We present algorithms for computing tree LCS that exploit the sparsity inherent in the tree LCS problem. Assuming G is smaller than F, our first algorithm runs in time \(O(r\cdot {\rm height}(F) \cdot {\rm height}(G)\cdot \lg\lg |G|)\), where r is the number of pairs (v ∈ F, w ∈ G) such that v and w have the same label. Our second algorithm runs in time \(O(L r \lg r \cdot \lg\lg|G|)\), where L is the size of the LCS of F and G. For this algorithm we present a novel three-dimensional alignment graph. Our third algorithm is intended for the constrained variant of the problem in which only nodes with zero or one children can be deleted. For this case we obtain an \(O(r h \lg \lg|G|)\) time algorithm, where h = height(F) + height(G).
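To make the sparsity parameter concrete, the following minimal sketch computes r, the number of equal-label node pairs, for two small trees and compares it with the dense count |F|·|G|. It does not implement any of the algorithms above. The tuple-based tree encoding, the helper names, the example trees, and the convention that height counts nodes on the longest root-to-leaf path are all assumptions made for illustration.

```python
from collections import Counter

# Assumed encoding: a tree is a (label, children) pair,
# where children is a list of trees in left-to-right order.

def labels(tree):
    """Yield the label of every node in the tree (order is irrelevant here)."""
    stack = [tree]
    while stack:
        label, children = stack.pop()
        yield label
        stack.extend(children)

def height(tree):
    """Nodes on the longest root-to-leaf path (an assumed convention)."""
    _, children = tree
    return 1 + max((height(c) for c in children), default=0)

def matching_pairs(f, g):
    """r = number of pairs (v in F, w in G) whose labels are equal."""
    cf, cg = Counter(labels(f)), Counter(labels(g))
    return sum(n * cg[label] for label, n in cf.items())

# Hypothetical example trees.
F = ("a", [("b", [("c", [])]), ("b", []), ("d", [])])
G = ("a", [("b", []), ("c", [("b", [])])])

r = matching_pairs(F, G)
size_f = sum(1 for _ in labels(F))
size_g = sum(1 for _ in labels(G))
print(r, size_f * size_g, height(F) + height(G))
```

In this example r is 6 while |F|·|G| is 20. The running times in the abstract are stated in terms of r rather than |F|·|G|, so the fewer label matches the two trees share, the smaller the bounds become.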
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mozes, S., Tsur, D., Weimann, O., Ziv-Ukelson, M. (2008). Fast Algorithms for Computing Tree LCS. In: Ferragina, P., Landau, G.M. (eds) Combinatorial Pattern Matching. CPM 2008. Lecture Notes in Computer Science, vol 5029. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69068-9_22
DOI: https://doi.org/10.1007/978-3-540-69068-9_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69066-5
Online ISBN: 978-3-540-69068-9
eBook Packages: Computer Science, Computer Science (R0)