Constant Factor Approximation of Edit Distance of Bounded Height Unordered Trees

  • Conference paper
String Processing and Information Retrieval (SPIRE 2009)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5721)

Abstract

The edit distance problem on two unordered trees is known to be MAX SNP-hard. In this paper, we present an approximation algorithm with approximation ratio 2h + 2, where we consider unit cost edit operations and h is the maximum height of the two input trees. The algorithm is based on an embedding of unit cost tree edit distance into L1 distance. We also present an efficient implementation of the algorithm using randomized dimension reduction.
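The abstract does not spell out the embedding, so the following is only a minimal sketch of the general idea of mapping unordered trees to vectors and comparing them by L1 distance. It assumes, for illustration, that the features are the complete subtrees rooted at each node, counted up to sibling reordering. The tree representation (a `(label, children)` pair) and the names `canonical`, `subtree_features` and `l1_distance` are hypothetical and not taken from the paper, and labels are assumed not to contain parentheses.

```python
from collections import Counter

# Assumed representation: a tree is a (label, children) pair,
# where children is a list of trees in arbitrary order.

def canonical(tree, out):
    """Return an order-independent string for the subtree rooted at `tree`
    and count it in `out` (a Counter keyed by subtree encoding)."""
    label, children = tree
    # Sorting the child encodings removes any dependence on sibling order,
    # so isomorphic unordered subtrees get the same string.
    parts = sorted(canonical(child, out) for child in children)
    code = "(" + label + "".join(parts) + ")"
    out[code] += 1
    return code

def subtree_features(tree):
    """Map a tree to a sparse count vector over its rooted subtrees."""
    feats = Counter()
    canonical(tree, feats)
    return feats

def l1_distance(f, g):
    """L1 distance between two sparse count vectors."""
    return sum(abs(f[k] - g[k]) for k in f.keys() | g.keys())

# Usage: t1 and t2 differ only in sibling order; t3 relabels one leaf.
t1 = ("a", [("b", []), ("c", [])])
t2 = ("a", [("c", []), ("b", [])])
t3 = ("a", [("b", []), ("d", [])])

d_same = l1_distance(subtree_features(t1), subtree_features(t2))
d_relabel = l1_distance(subtree_features(t1), subtree_features(t3))
```

Here `d_same` is 0 because the two trees are identical as unordered trees. `d_relabel` is 4, since a single relabel changes the encoding of the leaf and of its ancestor; with h = 1 this matches the 2h + 2 factor in this one small case, which illustrates but does not establish the bound. The randomized dimension reduction mentioned in the abstract is not sketched here.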

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fukagawa, D., Akutsu, T., Takasu, A. (2009). Constant Factor Approximation of Edit Distance of Bounded Height Unordered Trees. In: Karlgren, J., Tarhio, J., Hyyrö, H. (eds) String Processing and Information Retrieval. SPIRE 2009. Lecture Notes in Computer Science, vol 5721. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03784-9_2

  • DOI: https://doi.org/10.1007/978-3-642-03784-9_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03783-2

  • Online ISBN: 978-3-642-03784-9

  • eBook Packages: Computer Science, Computer Science (R0)