Abstract
Tree metrics that compare pairs of trees are an elementary tool for analyzing phylogenetic trees. The cophenetic distance is a classic vector-based tree metric introduced by Cardona et al. that originates from the pioneering work of Sokal and Rohlf more than 50 years ago. However, in phylogenetic analyses that compare sets of large-scale trees, the quadratic runtime of the best-known (naïve) algorithm for computing the cophenetic distance becomes prohibitive. Here we describe an algorithmic framework that computes the cophenetic distance under the \(L_1\)-norm in \(O(n \log ^2 n)\) time, where \(n\) is the size of the compared pair of trees. Building on the work of Sokal and Rohlf, we introduce a natural class of cophenetic distances and show that our algorithmic framework can compute each member of this class in \(O(n \log ^2 n)\) time. In addition, we present a modification of this framework for computing these distances under the \(L_2\)-norm in \(O(n \log n)\) time. Finally, we demonstrate the scalability of our algorithm.
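For orientation, the quadratic baseline mentioned in the abstract can be sketched as follows. This is an illustrative sketch only, not the authors' framework or implementation. It assumes the definition of Cardona et al., in which the cophenetic value of two leaves is the depth of their lowest common ancestor and the cophenetic value of a single leaf is its own depth, with every edge counted as length 1. The nested-tuple tree encoding and the function names `cophenetic_vector` and `cophenetic_distance` are hypothetical choices made for this example.

```python
from itertools import combinations


def cophenetic_vector(tree):
    """Map every leaf pair (a, b), a <= b, to its cophenetic value.

    Assumed encoding: a leaf is a string label, an internal node is a
    tuple of children. Depth is the number of edges from the root.
    """
    phi = {}

    def visit(node, depth):
        # Returns the list of leaf labels found below `node`.
        if not isinstance(node, tuple):
            phi[(node, node)] = depth  # a leaf paired with itself: its depth
            return [node]
        child_leaves = [visit(child, depth + 1) for child in node]
        # Leaves from two different child subtrees meet exactly at `node`.
        for left, right in combinations(child_leaves, 2):
            for a in left:
                for b in right:
                    phi[tuple(sorted((a, b)))] = depth
        return [leaf for leaves in child_leaves for leaf in leaves]

    visit(tree, 0)
    return phi


def cophenetic_distance(t1, t2, p=1):
    """L_p norm of the difference of the two cophenetic vectors (naive, quadratic)."""
    phi1, phi2 = cophenetic_vector(t1), cophenetic_vector(t2)
    if phi1.keys() != phi2.keys():
        raise ValueError("both trees must have the same leaf set")
    total = sum(abs(phi1[key] - phi2[key]) ** p for key in phi1)
    return total ** (1.0 / p)


balanced = (("a", "b"), ("c", "d"))
caterpillar = ((("a", "b"), "c"), "d")
print(cophenetic_distance(balanced, caterpillar, p=1))  # 7.0
```

The vector has one entry per leaf pair, so any method that materializes it needs time quadratic in the number of leaves. That is the cost the abstract's near-linear bounds avoid.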
References
Allen, B.L., Steel, M.: Subtree transfer operations and their induced metrics on evolutionary trees. Ann. Comb. 5(1), 1–15 (2001)
Bordewich, M., Semple, C.: On the computational complexity of the rooted subtree prune and regraft distance. Ann. Comb. 8(4), 409–423 (2005)
Bourque, M.: Arbres de Steiner et réseaux dont varie l’emplacement de certains sommets. Ph.D. thesis, University of Montréal, Montréal, Canada (1978)
Bryant, D.: Hunting for trees, building trees and comparing trees: theory and method in phylogenetic analysis. Ph.D. thesis, University of Canterbury, New Zealand (1997)
Cardona, G., Mir, A., Rosselló, F., Rotger, L.: The expected value of the squared cophenetic metric under the Yule and the uniform models. Math. Biosci. 295, 73–85 (2018)
Cardona, G., Mir, A., Rosselló, F., Rotger, L., Sánchez, D.: Cophenetic metrics for phylogenetic trees, after Sokal and Rohlf. BMC Bioinform. 14(1), 3 (2013)
Critchlow, D., Pearl, D., Qian, C.: The triples distance for rooted bifurcating phylogenetic trees. Syst. Biol. 45, 323–334 (1996)
DasGupta, B., et al.: On distances between phylogenetic trees. In: SODA, vol. 97, pp. 427–436 (1997)
Estabrook, G., McMorris, F., Meacham, C.: Comparison of undirected phylogenetic trees based on subtrees of four evolutionary units. Syst. Zool. 34, 193–200 (1985)
Eulenstein, O., Huzurbazar, S., Liberles, D.: Reconciling phylogenetic trees. In: Evolution After Gene Duplication. Wiley, Hoboken (2010)
Felsenstein, J.: Inferring Phylogenies. Sinauer Associates, Inc., Sunderland (2004)
Forster, P., Renfrew, C.: Phylogenetic Methods and the Prehistory of Languages. McDonald Institute for Archaeological Research, Cambridge (2006)
Górecki, P., Eulenstein, O., Tiuryn, J.: Unrooted tree reconciliation: a unified approach. IEEE/ACM Trans. Comput. Biol. Bioinform. 10(2), 522–536 (2013)
Harris, S., et al.: Whole-genome sequencing for analysis of an outbreak of meticillin-resistant Staphylococcus aureus: a descriptive study. Lancet. Infect. Dis. 13(2), 130–136 (2013)
Hein, J.: Reconstructing evolution of sequences subject to recombination using parsimony. Math. Biosci. 98(2), 185–200 (1990)
Hein, J., et al.: On the complexity of comparing evolutionary trees. Discrete Appl. Math. 71(1–3), 153–169 (1996)
Hickey, G., et al.: SPR distance computation for unrooted trees. Evol. Bioinform. online 4, 17–27 (2008)
Hoef-Emden, K.: Molecular phylogenetic analyses and real-life data. Comput. Sci. Eng. 7(3), 86–91 (2005)
St. John, K.: Review paper: the shape of phylogenetic treespace. Syst. Biol. 66(1), e83–e94 (2017)
Kendall, M., Colijn, C.: Mapping phylogenetic trees to reveal distinct patterns of evolution. Mol. Biol. Evol. 33(10), 2735–2743 (2016)
Kuhner, M.K., Yamato, J.: Practical performance of tree comparison metrics. Syst. Biol. 64(2), 205–214 (2015)
Li, M., Tromp, J., Zhang, L.: On the nearest neighbour interchange distance between evolutionary trees. J. Theor. Biol. 182(4), 463–467 (1996)
Markin, A., Eulenstein, O.: Cophenetic median trees under the Manhattan distance. In: ACM-BCB 2017, pp. 194–202. ACM, New York (2017)
Robinson, D.F., Foulds, L.R.: Comparison of phylogenetic trees. Math. Biosci. 53(1–2), 131–147 (1981)
Roux, J., et al.: Resolving the native provenance of invasive fireweed (Senecio madagascariensis Poir.) in the Hawaiian Islands as inferred from phylogenetic analysis. Div. Distr. 12, 694–702 (2006)
Sand, A., et al.: Algorithms for computing the triplet and quartet distances for binary and general trees. Biology 2(4), 1189–1209 (2013)
Semple, C., Steel, M.A.: Phylogenetics. Oxford University Press, Oxford (2003)
Sokal, R.R., Rohlf, F.J.: The comparison of dendrograms by objective methods. Taxon 11(2), 33–40 (1962)
Steel, M.A., Penny, D.: Distributions of tree comparison metrics. Syst. Biol. 42(2), 126–141 (1993)
Williams, W., Clifford, H.: On the comparison of two classifications of the same set of elements. Taxon 20(4), 519–522 (1971)
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Górecki, P., Markin, A., Eulenstein, O. (2018). Cophenetic Distances: A Near-Linear Time Algorithmic Framework. In: Wang, L., Zhu, D. (eds) Computing and Combinatorics. COCOON 2018. Lecture Notes in Computer Science, vol 10976. Springer, Cham. https://doi.org/10.1007/978-3-319-94776-1_15
DOI: https://doi.org/10.1007/978-3-319-94776-1_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-94775-4
Online ISBN: 978-3-319-94776-1
eBook Packages: Computer Science, Computer Science (R0)