Abstract
Given a collection of phylogenetic trees on the same leaf label-set, the Maximum Agreement Forest problem (Maf) asks for a largest common subforest of these trees. The Maf problem on two binary phylogenetic trees has been studied extensively. In this paper, we focus on the Maf problem on multiple (i.e., two or more) binary phylogenetic trees and present two polynomial-time approximation algorithms: one for the Maf problem on multiple rooted trees, and the other for the Maf problem on multiple unrooted trees. Our algorithm for multiple rooted trees has approximation ratio 3, improving on the previous best ratio of 8. Our algorithm for multiple unrooted trees has ratio 4 and is the first constant-ratio approximation algorithm for that problem.



Notes
Some definitions in the study of maximum agreement forests have been somewhat confusing and misleading. If the size of a forest denotes its number of edges, then the size equals the number of vertices minus the order of the forest, where the order is the number of trees (connected components) it contains. Thus, when the number of vertices is fixed, a forest of large size has small order. The term “maximum agreement forest” refers to an agreement forest of maximum size. However, as studied in the literature, the maximum agreement forest problem is in fact a minimization problem, whose objective is to minimize the order of an agreement forest.
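The relation between size and order in this note can be checked on a small example. The following is a minimal sketch, not part of the algorithms in this paper; the helper name `forest_order_and_size` and the seven-vertex example are assumptions made for illustration only.

```python
def forest_order_and_size(vertices, edges):
    """Return (order, size) of a forest.

    order = number of trees (connected components), size = number of edges.
    Hypothetical helper for illustration; uses a simple union-find.
    """
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent pointers to the component representative,
        # shortening the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, w in edges:
        ru, rw = find(u), find(w)
        if ru == rw:
            raise ValueError("input contains a cycle, so it is not a forest")
        parent[ru] = rw

    order = len({find(v) for v in vertices})
    size = len(edges)
    return order, size


# Illustrative data: a path on 7 vertices, then the same vertex set
# after deleting the two edges (2, 3) and (4, 5).
vertices = list(range(7))
full_tree = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
cut_forest = [(0, 1), (1, 2), (3, 4), (5, 6)]

for edges in (full_tree, cut_forest):
    order, size = forest_order_and_size(vertices, edges)
    # size == number of vertices - order holds for every forest
    assert size == len(vertices) - order
```

Deleting edges lowers the size and raises the order by the same amount, which is why maximizing the size of an agreement forest and minimizing its order describe the same objective.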
The indices used here differ slightly from those used in the algorithm Apx-MAF: in the algorithm Apx-MAF, step 2 operates on \(F_1\) and \(F_{i+1}\) for \(1 \le i \le m-1\), which simplifies the notation in the proof of Theorem 1, whereas in this section we let step 2 of the algorithm operate on \(F_1\) and \(F_i\) for \(2 \le i \le m\) to simplify the description of our meta-steps.
During the preparation of the final version of this manuscript, the authors were informed by an anonymous referee that Mukhopadhyay and Bhabak had announced an \(O(kn^5)\)-time approximation algorithm of ratio 3 for the Maf problem on \(k\) rooted binary phylogenetic trees [16].
References
Aho, A., Hopcroft, J., Ullman, J.: The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading (1974)
Allen, B., Steel, M.: Subtree transfer operations and their induced metrics on evolutionary trees. Ann. Comb. 5(1), 1–15 (2001)
Beiko, R.G., Hamilton, N.: Phylogenetic identification of lateral genetic transfer events. BMC Evolut. Biol. 6(1), 15 (2006)
Bonet, M., John, K.S., Mahindru, R., Amenta, N.: Approximating subtree distances between phylogenies. J. Comput. Biol. 13(8), 1419–1434 (2006)
Bordewich, M., McCartin, C., Semple, C.: A 3-approximation algorithm for the subtree distance between phylogenies. J. Discrete Algorithms 6(3), 458–471 (2008)
Bordewich, M., Semple, C.: On the computational complexity of the rooted subtree prune and regraft distance. Ann. Comb. 8(4), 409–423 (2005)
Buneman, P.: The recovery of trees from measures of dissimilarity. In: Kendall, D., Tautu, P. (eds.) Mathematics in the Archaeological and Historical Sciences, pp. 387–395. Edinburgh University Press, Edinburgh (1971)
Chataigner, F.: Approximating the maximum agreement forest on \(k\) trees. Inf. Process. Lett. 93, 239–244 (2005)
Chen, J., Fan, J.-H., Sze, S.-H.: Parameterized and approximation algorithms for maximum agreement forest in multifurcating trees. Theor. Comput. Sci. 562, 496–512 (2015)
Chen, Z., Wang, L.: Algorithms for reticulate networks of multiple phylogenetic trees. IEEE/ACM Trans. Comput. Biol. Bioinform. 9, 372–384 (2012)
Diestel, R.: Graph Theory, 4th edition. Graduate Texts in Mathematics, vol. 173. Springer, Heidelberg (2010)
Dudas, G., Bedford, T., Lycett, S., Rambaut, A.: Reassortment between influenza B lineages and the emergence of a coadapted PB1-PB2-HA gene complex. Mol. Biol. Evol. 32(1), 162–172 (2014). (supplemental information)
Hallett, M., McCartin, C.: A faster FPT algorithm for the maximum agreement forest problem. Theory Comput. Syst. 41(3), 539–550 (2007)
Hein, J., Jiang, T., Wang, L., Zhang, K.: On the complexity of comparing evolutionary trees. Discrete Appl. Math. 71, 153–169 (1996)
Li, M., Tromp, J., Zhang, L.: On the nearest neighbour interchange distance between evolutionary trees. J. Theor. Biol. 182(4), 463–467 (1996)
Mukhopadhyay, A., Bhabak, P.: A 3-factor approximation algorithm for a minimum acyclic agreement forest on \(k\) rooted, binary phylogenetic trees. CoRR abs/1407.7125 (2014)
Robinson, D., Foulds, L.: Comparison of phylogenetic trees. Math. Biosci. 53(1–2), 131–147 (1981)
Rodrigues, M., Sagot, M., Wakabayashi, Y.: Some approximation results for the maximum agreement forest problem. In: Proceedings of the RANDOM-APPROX 2001, Lecture Notes in Computer Science, vol. 2129, pp. 159–169 (2001)
Rodrigues, E., Sagot, M., Wakabayashi, Y.: The maximum agreement forest problem: approximation algorithms and computational experiments. Theor. Comput. Sci. 374, 91–110 (2007)
Shi, F., Wang, J., Chen, J., Feng, Q., Guo, J.: Algorithms for parameterized maximum agreement forest problem on multiple trees. Theor. Comput. Sci. 554, 207–216 (2014)
Shi, F., Feng, Q., You, J., Wang, J.: Improved approximation algorithm for maximum agreement forest of two rooted binary phylogenetic trees. J. Comb. Optim. (2015a). doi:10.1007/s10878-015-9921-7
Shi, F., Wang, J., Yang, Y., Feng, Q., Li, W., Chen, J.: A fixed-parameter algorithm for the maximum agreement forest problem on multifurcating trees. Sci. China Inf. Sci. (2015b). doi:10.1007/s11432-015-5355-1
Swofford, D., Olsen, G., Waddell, P., Hillis, D.: Phylogenetic inference. In: Hillis, D., Moritz, D., Mable, B. (eds.) Molecular Systematics, 2nd edn, pp. 407–514. Sinauer Associates, Sunderland (1996)
Whidden, C., Zeh, N.: A unifying view on approximation and FPT of agreement forests. In: Proceedings of the WABI 2009, Lecture Notes in Computer Science, vol. 5724, pp. 390–401 (2009)
Whidden, C., Beiko, R.G., Zeh, N.: Fixed-parameter algorithms for maximum agreement forests. SIAM J. Comput. 42(4), 1431–1466 (2013)
Whidden, C., Zeh, N., Beiko, R.G.: Supertrees based on the subtree prune-and-regraft distance. Syst. Biol. 63(4), 566–581 (2014)
Whidden, C., Matsen IV, F.A.: Quantifying MCMC exploration of phylogenetic tree space. Syst. Biol. 64(3), 472 (2015)
Wu, Y.: Close lower and upper bounds for the minimum reticulate network of multiple phylogenetic trees. Bioinformatics 26(12), i140–i148 (2010)
Acknowledgments
We would like to thank the anonymous referees, whose comments and suggestions have greatly improved the presentation of this paper. In particular, one referee provided further pointers to applications of algorithms for maximum agreement forests on multiple trees, and another referee updated us on the status of approximation algorithms for maximum agreement forests on multiple rooted trees.
Additional information
A preliminary version of this work was reported in the Proceedings of the 20th International Computing and Combinatorics Conference, Lecture Notes in Computer Science, vol. 8591, pp. 381–392, 2014. This work is supported by the National Natural Science Foundation of China under Grants 61232001, 61472449, 61370172, and 61420106009, the Major Science and Technology Research Program for Strategic Emerging Industry of Hunan (Grant No. 2012GK4054), and the Research Fund for the Doctoral Program of Higher Education of China (Grant No. 20130162130001).
About this article
Cite this article
Chen, J., Shi, F. & Wang, J. Approximating Maximum Agreement Forest on Multiple Binary Trees. Algorithmica 76, 867–889 (2016). https://doi.org/10.1007/s00453-015-0087-6
DOI: https://doi.org/10.1007/s00453-015-0087-6