Abstract
The edit distance problem for two unordered trees is known to be MAX SNP-hard. In this paper, we present an approximation algorithm with approximation ratio 2h + 2 under unit-cost edit operations, where h is the maximum height of the two input trees. The algorithm is based on an embedding of unit-cost tree edit distance into L1 distance. We also present an efficient implementation of the algorithm using randomized dimension reduction.
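To make the idea of embedding trees into L1 space concrete, the following is a minimal sketch under stated assumptions, not the construction from the paper. It assumes each tree is mapped to a vector that counts its complete subtrees (a node together with all of its descendants), with sibling order ignored, and then compares two such vectors by L1 distance. The tree encoding as `(label, children)` pairs and the names `canonical`, `subtree_vector`, and `l1_distance` are hypothetical, and the randomized dimension reduction step is not shown.

```python
from collections import Counter

# Assumption: a tree is a (label, children) pair, where children is a list
# of trees and labels are short strings without parentheses.

def canonical(tree, bag):
    """Return an order-independent string for `tree` and record every
    complete subtree (a node plus all its descendants) in `bag`."""
    label, children = tree
    # Sorting the children's canonical forms makes sibling order irrelevant,
    # which is what "unordered" requires.
    parts = sorted(canonical(child, bag) for child in children)
    form = "(" + label + "".join(parts) + ")"
    bag[form] += 1
    return form

def subtree_vector(tree):
    """Sparse feature vector: canonical subtree form -> occurrence count."""
    bag = Counter()
    canonical(tree, bag)
    return bag

def l1_distance(u, v):
    """L1 distance between two sparse count vectors."""
    return sum(abs(u[k] - v[k]) for k in u.keys() | v.keys())

t1 = ("a", [("b", []), ("c", [("d", [])])])
t2 = ("a", [("c", [("d", [])]), ("b", [])])  # t1 with siblings swapped
t3 = ("a", [("b", []), ("c", [("e", [])])])  # leaf d relabeled to e

print(l1_distance(subtree_vector(t1), subtree_vector(t2)))  # 0
print(l1_distance(subtree_vector(t1), subtree_vector(t3)))  # 6
```

In this toy example a single relabeling of a leaf changes the complete subtree rooted at every ancestor of that leaf, so one edit operation moves the L1 distance by an amount that grows with the height of the tree. This gives some intuition for why a height-dependent factor could appear in an approximation ratio, though the exact bound and embedding are those stated in the paper, not this sketch.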
© 2009 Springer-Verlag Berlin Heidelberg
Cite this paper
Fukagawa, D., Akutsu, T., Takasu, A. (2009). Constant Factor Approximation of Edit Distance of Bounded Height Unordered Trees. In: Karlgren, J., Tarhio, J., Hyyrö, H. (eds) String Processing and Information Retrieval. SPIRE 2009. Lecture Notes in Computer Science, vol 5721. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03784-9_2
DOI: https://doi.org/10.1007/978-3-642-03784-9_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03783-2
Online ISBN: 978-3-642-03784-9
eBook Packages: Computer Science (R0)