Skip to main content
Log in

A metric normalization of tree edit distance

  • Research Article
  • Published:
Frontiers of Computer Science in China Aims and scope Submit manuscript

Abstract

Traditional normalized tree edit distances do not satisfy the triangle inequality. We present a metric normalization method for tree edit distance, which results in a new normalized tree edit distance fulfilling the triangle inequality, under the condition that the weight function is a metric over the set of elementary edit operations with all costs of insertions/deletions having the same weight. We prove that the new distance, in the range [0, 1], is a genuine metric as a simple function of the sizes of two ordered labeled trees and the tree edit distance between them, which can be directly computed through tree edit distance with the same complexity. Based on an efficient algorithm to represent digits as ordered labeled trees, we show that the normalized tree edit metric can provide slightly better results than other existing methods in handwritten digit recognition experiments using the approximating and eliminating search algorithm (AESA) algorithm.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Zhang K, Shasha D, Wang J T L. Approximate tree matching in the presence of variable length don’t cares. Journal of Algorithms, 1994, 16(1): 33–66

    Article  MathSciNet  MATH  Google Scholar 

  2. Tai K C. The tree-to-tree correction problem. Journal of the ACM, 1979, 26(3): 422–433

    Article  MathSciNet  MATH  Google Scholar 

  3. Zhang K, Shasha D. Simple fast algorithms for the editing distance between trees and related problems. SIAM Journal on Computing, 1989, 18(6): 1245–1262

    Article  MathSciNet  MATH  Google Scholar 

  4. Kilpeläinen P, Mannila H. Ordered and unordered tree inclusion. SIAM Journal on Computing, 1995, 24(2): 340–356

    Article  MathSciNet  MATH  Google Scholar 

  5. Klein P, Tirthapura S, Sharvit D, Kimia B. A tree-edit-distance algorithm for comparing simple, closed shapes. In: Proceedings of 11th Annual ACM-SIAM Symposium on Discrete Algorithms. 2000, 696–704

  6. Hoffmann C M, O’Donnell M J. Pattern matching in trees. Journal of the ACM, 1982, 29(1): 68–95

    Article  MathSciNet  MATH  Google Scholar 

  7. Ramesh R, Ramakrishnan I V. Nonlinear pattern matching in trees. Journal of the ACM, 1992, 39(2): 295–316

    Article  MathSciNet  MATH  Google Scholar 

  8. Bille P. A survey on tree edit distance and related problems. Theoretical Computer Science, 2005, 337(1–3): 217–239

    Article  MathSciNet  MATH  Google Scholar 

  9. Levenshtein A. Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Doklady, 1966, 10(8): 707–710

    MathSciNet  Google Scholar 

  10. Sellers P H. On the theory and computation of evolutionary distances. SIAM Journal of Applied Mathematics, 1974, 26(4): 787–793

    Article  MathSciNet  MATH  Google Scholar 

  11. Wagner R A, Fischer M J. The string-to-string correction problem. Journal of the ACM, 1974, 21(1): 168–173

    Article  MathSciNet  MATH  Google Scholar 

  12. Weigel A, Fein F. Normalizing the weighted edit distance. In: Proceedings of 12th International Conference on Pattern Recognition. 1994, 399–402

  13. Yianilos P N. Normalized Forms for Two Common Metrics. Report 91-082-9027-1, revision 7. July 2002, NEC Research Inst. http://www.pnylab.com/pny/

  14. Rico-Juan J R, Mico L. Comparison of AESA and LAESA search algorithms using string and tree-edit-distance. Pattern Recognition Letters, 2003, 24(9–10): 1417–1426

    Article  MATH  Google Scholar 

  15. Yujian L, Bo L. A normalized Levenshtein distance metric. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(6): 1091–1095

    Article  Google Scholar 

  16. Marzal A, Vidal E. Computation of normalized edit distance and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1993, 15(9): 926–932

    Article  Google Scholar 

  17. Schroeder M, Schweimeier R. Argument and misunderstandings: fuzzy unification for negotiating agents. Electronic Notes in Theoretical Computer Science, 2002, 70(5): 1–19

    Article  Google Scholar 

  18. Vidal E. New formulation and improvements of the nearest-neighbour approximating and eliminating search algorithm (AESA). Pattern Recognition Letters, 1994, 15(1): 1–7

    Article  MathSciNet  Google Scholar 

  19. Lecun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998, 86(11): 2278–2324

    Article  Google Scholar 

  20. Carrasco R C, Forcada ML. A note on the Nagen-draprasad-Wang-Gupta thining algorithm. Pattern Recognition Letters, 1995, 16(5): 539–541

    Article  MATH  Google Scholar 

  21. Kong L B, Tang S W, Yang D Q, Wang T J, Gao J. Querying Techniques for XML Data. Journal of Software, 2007, 18(6): 1400–1418 (in Chinese)

    Article  MATH  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yujian Li.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Li, Y., Chenguang, Z. A metric normalization of tree edit distance. Front. Comput. Sci. China 5, 119–125 (2011). https://doi.org/10.1007/s11704-011-9336-2

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11704-011-9336-2

Keywords

Navigation