DOI: 10.1145/3313276.3316388
research-article

1+ε approximation of tree edit distance in quadratic time

Published: 23 June 2019

Abstract

Edit distance is one of the most fundamental problems in computer science. Tree edit distance is a natural generalization of edit distance to ordered rooted trees. Such a generalization extends the applications of edit distance to areas such as computational biology, structured data analysis (e.g., XML), image analysis, and compiler optimization. Perhaps the most notable application of tree edit distance is in the analysis of RNA molecules in computational biology where the secondary structure of RNA is typically represented as a rooted tree.
The best-known solution for tree edit distance runs in cubic time. Recently, Bringmann et al. showed that an O(n^2.99) algorithm for weighted tree edit distance is unlikely by proving a conditional lower bound on the computational complexity of tree edit distance. This shows a substantial gap between the computational complexity of tree edit distance and that of edit distance, for which a simple dynamic program solves the problem in quadratic time.
In this work, we give the first non-trivial approximation algorithms for tree edit distance. Our main result is a quadratic time approximation scheme for tree edit distance that approximates the solution within a factor of 1+ε for any constant ε > 0.
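
For orientation, the "simple dynamic program" for string edit distance mentioned in the abstract can be sketched as follows. This is the textbook quadratic-time recurrence for strings, not the paper's tree algorithm; the function name `edit_distance` and the unit-cost model (insert, delete, substitute each cost 1) are assumptions made for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance between strings a and b in O(len(a) * len(b)) time."""
    # prev[j] is the distance between the previous prefix of a and b[:j].
    # Initially that prefix is empty, so prev[j] = j insertions.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Turning a[:i] into the empty string takes i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                # delete ca
                curr[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),   # substitute, or free match
            ))
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))  # 3
```

Only two rows of the table are kept at a time, so the sketch uses linear space while still filling all len(a) * len(b) cells, which is the quadratic running time the abstract contrasts with the cubic bound for trees.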

References

[1]
Amir Abboud. 2014. Hardness for Easy Problems. Presented at Satellite Workshop of ICALP (YR-ICALP).
[2]
Tatsuya Akutsu, Daiji Fukagawa, and Atsuhiro Takasu. 2010. Approximating Tree Edit Distance Through String Edit Distance. Algorithmica 57, 2 (2010), 325–348.
[3]
Stephen Alstrup, Thore Husfeldt, and Theis Rauhe. 1998. Marked Ancestor Problems. In FOCS. IEEE, 534–543.
[4]
Alexandr Andoni, Robert Krauthgamer, and Krzysztof Onak. 2010. Polylogarithmic Approximation for Edit Distance and the Asymmetric Query Complexity. In FOCS. IEEE, 377–386.
[5]
Alexandr Andoni and Krzysztof Onak. 2009. Approximating Edit Distance in Near-linear Time. In STOC. ACM, 199–204.
[6]
Arturs Backurs, Piotr Indyk, and Ludwig Schmidt. 2017. Better Approximations for Tree Sparsity in Nearly-Linear Time. In SODA. SIAM, 2215–2229.
[7]
Ziv Bar-Yossef, TS Jayram, Robert Krauthgamer, and Ravi Kumar. 2004. Approximating Edit Distance Efficiently. In FOCS. IEEE, 550–559.
[8]
Tuğkan Batu, Funda Ergun, and Cenk Sahinalp. 2006. Oblivious String Embeddings and Edit Distance Approximations. In SODA. SIAM, 792–801.
[9]
Philip Bille. 2005. A Survey on Tree Edit Distance and Related Problems. Theoretical Computer Science 337, 1 (2005), 217–239.
[10]
Mahdi Boroujeni, Soheil Ehsani, Mohammad Ghodsi, MohammadTaghi Haji-Aghayi, and Saeed Seddighin. 2018. Approximating Edit Distance in Truly Subquadratic Time: Quantum and MapReduce. In SODA. SIAM, 1170–1189.
[11]
Karl Bringmann, Paweł Gawrychowski, Shay Mozes, and Oren Weimann. 2018. Tree Edit Distance Cannot be Computed in Strongly Subcubic Time (unless APSP can). In SODA. SIAM, 1190–1206.
[12]
Karl Bringmann, Fabrizio Grandoni, Barna Saha, and Virginia Vassilevska Williams. 2016. Truly Sub-cubic Algorithms for Language Edit Distance and RNA-Folding via Fast Bounded-Difference Min-Plus Product. In FOCS. IEEE, 375–384.
[13]
Peter Buneman, Martin Grohe, and Christoph Koch. 2003. Path Queries on Compressed XML. In VLDB. VLDB Endowment, 141–152.
[14]
Horst Bunke and Kim Shearer. 1998. A Graph Distance Metric Based on the Maximal Common Subgraph. Pattern Recognition Letters 19, 3 (1998), 255–259.
[15]
Diptarka Chakraborty, Debarati Das, Elazar Goldenberg, Michal Koucky, and Michael Saks. 2018. Approximating Edit Distance Within Constant Factor in Truly Sub-Quadratic Time. In FOCS. IEEE, 979–990.
[16]
Sudarshan S. Chawathe. 1999. Comparing Hierarchical Data in External Memory. In VLDB. Morgan Kaufmann Publishers Inc., 90–101.
[17]
Erik D. Demaine, Shay Mozes, Benjamin Rossman, and Oren Weimann. 2007. An Optimal Decomposition Algorithm for Tree Edit Distance. In ICALP. Springer, 146–157.
[18]
Paolo Ferragina, Fabrizio Luccio, Giovanni Manzini, and S. Muthukrishnan. 2009. Compressing and Indexing Labeled Trees, with Applications. J. ACM 57, 1 (Nov. 2009), 4:1–4:33.
[19]
Ofer Freedman, Paweł Gawrychowski, Patrick K. Nicholson, and Oren Weimann. 2017. Optimal Distance Labeling Schemes for Trees. In PODC. ACM, 185–194.
[20]
Paweł Gawrychowski, Nadav Krasnopolsky, Shay Mozes, and Oren Weimann. 2017. Dispersion on Trees. In ESA. Dagstuhl, 40:1–40:13.
[21]
Dan Gusfield. 1997. Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press.
[22]
Bernhard Haeupler, Aviad Rubinstein, and Amirbehshad Shahrasbi. 2019. Near-Linear Time Insertion-Deletion Codes and (1+ε)-Approximating Edit Distance via Indexing. In STOC. ACM.
[23]
Dov Harel and Robert E. Tarjan. 1984. Fast Algorithms for Finding Nearest Common Ancestors. SIAM J. Comput. 13, 2 (1984), 338–355.
[24]
Piotr Indyk. 2001. Algorithmic Applications of Low-distortion Geometric Embeddings. In FOCS. IEEE, 10–33.
[25]
Rajesh Jayaram and Barna Saha. 2017. Approximating Language Edit Distance Beyond Fast Matrix Multiplication: Ultralinear Grammars Are Where Parsing Becomes Hard!. In ICALP. Dagstuhl, 19:1–19:15.
[26]
Camille Jordan. 1869. Sur les assemblages de lignes. Journal für die reine und angewandte Mathematik 70 (1869), 185–190.
[27]
Philip N. Klein. 1998. Computing the Edit-Distance Between Unrooted Ordered Trees. In ESA. Springer, 91–102.
[28]
Gad M. Landau, Eugene W. Myers, and Jeanette P. Schmidt. 1998. Incremental String Comparison. SIAM J. Comput. 27, 2 (1998), 557–582.
[29]
Gad M. Landau and Uzi Vishkin. 1986. Introducing Efficient Parallelism into Approximate String Matching and a New Serial Algorithm. In STOC. ACM, 220–230.
[30]
Barna Saha. 2017. Fast & Space-Efficient Approximations of Language Edit Distance and RNA Folding: An Amnesic Dynamic Programming Approach. In FOCS. IEEE, 295–306.
[31]
Stanley M. Selkow. 1977. The Tree-to-Tree Editing Problem. Inform. Process. Lett. 6, 6 (1977), 184–186.
[32]
Bruce A. Shapiro and Kaizhong Zhang. 1990. Comparing Multiple RNA Secondary Structures Using Tree Comparisons. Bioinformatics 6, 4 (1990), 309–318.
[33]
Dennis Shasha and Kaizhong Zhang. 1990. Fast Algorithms for the Unit Cost Editing Distance Between Trees. Journal of Algorithms 11, 4 (Dec. 1990), 581–621.
[34]
Daniel D. Sleator and Robert E. Tarjan. 1983. A Data Structure for Dynamic Trees. J. Comput. System Sci. 26, 3 (June 1983), 362–391.
[35]
Daniel D. Sleator and Robert E. Tarjan. 1985. Self-adjusting Binary Search Trees. J. ACM 32, 3 (1985), 652–686.
[36]
Kuo Chung Tai. 1979. The Tree-to-Tree Correction Problem. J. ACM 26, 3 (July 1979), 422–433.
[37]
Hélène Touzet. 2005. A Linear Tree Edit Distance Algorithm for Similar Ordered Trees. In CPM. Springer, 334–345.
[38]
Michael S. Waterman. 1995. Introduction to Computational Biology: Maps, Sequences and Genomes. CRC Press.
[39]
Kaizhong Zhang and Dennis E. Shasha. 1989. Simple Fast Algorithms for the Editing Distance Between Trees and Related Problems. SIAM J. Comput. 18, 6 (1989), 1245–1262.

Published In

STOC 2019: Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing
June 2019
1258 pages
ISBN:9781450367059
DOI:10.1145/3313276
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. approximation algorithms
  2. fine-grained complexity
  3. graph algorithms
  4. randomized algorithms

Qualifiers

  • Research-article

Funding Sources

  • DARPA SIMPLEX
  • NSF AF:Medium
  • DARPA GRAPHS/AFOSR
  • NSF CAREER
  • NSF BIGDATA

Conference

STOC '19
Acceptance Rates

Overall Acceptance Rate 1,469 of 4,586 submissions, 32%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 24
  • Downloads (last 6 weeks): 1
Reflects downloads up to 13 Feb 2025

Cited By

  • (2023) Weighted Edit Distance Computation: Strings, Trees, and Dyck. Proceedings of the 55th Annual ACM Symposium on Theory of Computing, 377-390. DOI: 10.1145/3564246.3585178. Online publication date: 2-Jun-2023
  • (2022) WRS: Workflow Retrieval System for Cloud Automatic Remediation. NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, 1-10. DOI: 10.1109/NOMS54207.2022.9789843. Online publication date: 25-Apr-2022
  • (2022) Õ(n+poly(k))-time Algorithm for Bounded Tree Edit Distance. 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), 686-697. DOI: 10.1109/FOCS54457.2022.00071. Online publication date: Oct-2022
  • (2022) On the Hardness of Computing the Edit Distance of Shallow Trees. String Processing and Information Retrieval, 290-302. DOI: 10.1007/978-3-031-20643-6_21. Online publication date: 1-Nov-2022
  • (2021) New and improved algorithms for unordered tree inclusion. Theoretical Computer Science. DOI: 10.1016/j.tcs.2021.06.013. Online publication date: Jun-2021
  • (2021) Algorithmic Techniques. Algorithms on Trees and Graphs, 45-83. DOI: 10.1007/978-3-030-81885-2_2. Online publication date: 12-Oct-2021
  • (2020) Minimal Edit-Based Diffs for Large Trees. Proceedings of the 29th ACM International Conference on Information & Knowledge Management, 1225-1234. DOI: 10.1145/3340531.3412026. Online publication date: 19-Oct-2020
