Fast Computation of the Tree Edit Distance between Unordered Trees Using IP Solvers

  • Conference paper
Discovery Science (DS 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8777)

Abstract

We propose a new method for computing the tree edit distance between two unordered trees by problem encoding. Our method transforms an instance of the computation into an instance of an integer programming (IP) problem and solves it with an efficient IP solver. The tree edit distance is defined as the minimum cost of a sequence of edit operations (substitution, deletion, or insertion) that transforms one tree into another. Although computing it for unordered trees is NP-hard, several encoding techniques have been proposed to make the computation efficient in practice; one example is an encoding into the clique problem. As a new encoding method, we propose to use IP solvers and provide new IP formulations representing the problem of finding the minimum cost mapping between two unordered trees, where the minimum cost exactly coincides with the tree edit distance. IP solvers are available as an alternative to solvers for the clique problem, and our method can efficiently compute variations of the tree edit distance by adding constraints. Our experimental results with Glycan datasets and the Web log dataset CSLOGS show that our method is much faster than an existing method when the input trees have a large degree. We also show that two variations of the tree edit distance can be computed efficiently by IP solvers.
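To make the idea of a minimum cost mapping concrete, the following is a minimal sketch and not the formulation from the paper. It assumes unit costs for all three edit operations, treats each node pair (v, w) as a 0/1 decision, and checks the usual mapping conditions (one-to-one, ancestor relation preserved) pairwise. The exhaustive search stands in for an IP solver and is only practical for very small trees; the names `ancestors` and `tree_edit_distance` and the parent-dictionary tree encoding are hypothetical choices made for this illustration.

```python
def ancestors(parent, v):
    """Return the set of proper ancestors of node v (parent[root] is None)."""
    result = set()
    while parent[v] is not None:
        v = parent[v]
        result.add(v)
    return result


def tree_edit_distance(parent1, label1, parent2, label2):
    """Unit-cost edit distance between two rooted unordered labeled trees.

    Illustrative assumption: every operation costs 1, so a mapping M costs
    |T1| + |T2| - sum over (v, w) in M of (2 - [labels differ]).
    """
    nodes1, nodes2 = list(parent1), list(parent2)
    anc1 = {v: ancestors(parent1, v) for v in nodes1}
    anc2 = {w: ancestors(parent2, w) for w in nodes2}
    pairs = [(v, w) for v in nodes1 for w in nodes2]

    def compatible(p, q):
        (v1, w1), (v2, w2) = p, q
        if v1 == v2 or w1 == w2:  # mapping must be one-to-one
            return False
        # ancestor relation must agree in both trees; sibling order is ignored
        return ((v1 in anc1[v2]) == (w1 in anc2[w2])
                and (v2 in anc1[v1]) == (w2 in anc2[w1]))

    # Empty mapping: delete every node of T1, insert every node of T2.
    best = len(nodes1) + len(nodes2)

    def search(i, chosen, cost):
        nonlocal best
        if i == len(pairs):
            best = min(best, cost)
            return
        search(i + 1, chosen, cost)  # decision variable for pairs[i] set to 0
        v, w = pairs[i]
        if all(compatible(pairs[i], q) for q in chosen):
            # Mapping v to w avoids one deletion and one insertion,
            # and costs one substitution if the labels differ.
            gain = 2 - (label1[v] != label2[w])
            search(i + 1, chosen + [pairs[i]], cost - gain)

    search(0, [], best)
    return best


# Usage: the same children in a different order are at distance 0.
t1_parent, t1_label = {0: None, 1: 0, 2: 0}, {0: "a", 1: "b", 2: "c"}
t2_parent, t2_label = {0: None, 1: 0, 2: 0}, {0: "a", 1: "c", 2: "b"}
print(tree_edit_distance(t1_parent, t1_label, t2_parent, t2_label))
```

In an IP encoding of this sketch, each incompatible pair of decisions would become a linear constraint stating that at most one of the two variables may be 1, and the cost expression would become the objective; a variation of the distance would then correspond to adding further constraints on which pairs may be selected together.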

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Kondo, S., Otaki, K., Ikeda, M., Yamamoto, A. (2014). Fast Computation of the Tree Edit Distance between Unordered Trees Using IP Solvers. In: Džeroski, S., Panov, P., Kocev, D., Todorovski, L. (eds) Discovery Science. DS 2014. Lecture Notes in Computer Science, vol 8777. Springer, Cham. https://doi.org/10.1007/978-3-319-11812-3_14

  • DOI: https://doi.org/10.1007/978-3-319-11812-3_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11811-6

  • Online ISBN: 978-3-319-11812-3

  • eBook Packages: Computer Science, Computer Science (R0)
