Abstract
We define, analyze, and give efficient algorithms for two kinds of distance measures for rooted and unrooted phylogenies. For rooted trees, our measures are based on the topologies the input trees induce on triplets; that is, on three-element subsets of the set of species. For unrooted trees, the measures are based on quartets (four-element subsets). Triplet and quartet-based distances provide a robust and fine-grained measure of the similarities between trees. The distinguishing feature of our distance measures relative to traditional quartet and triplet distances is their ability to deal cleanly with the presence of unresolved nodes, also called polytomies. For rooted trees, these are nodes with more than two children; for unrooted trees, they are nodes of degree greater than three.
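To make the notion of an induced triplet topology concrete, the following is a minimal sketch, not the authors' algorithm. It assumes rooted trees are encoded as nested Python tuples with string leaf labels, and the helper names `leaves` and `triplet_topology` are hypothetical. For three leaves it finds the lowest node whose subtree contains all of them: if two of the leaves fall in the same child, that pair forms the resolved side of the triplet; if the three fall in three different children, the triplet is unresolved.

```python
def leaves(tree):
    """Return the set of leaf labels below a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def triplet_topology(tree, triple):
    """Return the pair of leaves grouped together by `tree`, or None if unresolved.

    Assumes `triple` holds three distinct leaf labels that all occur in `tree`.
    """
    triple = set(triple)
    node = tree
    while True:
        # Split the three leaves according to the child subtree they fall in.
        groups = [triple & leaves(child) for child in node]
        groups = [g for g in groups if g]
        if len(groups) == 1:
            # All three leaves lie below one child: descend into it.
            node = next(child for child in node if triple <= leaves(child))
        elif len(groups) == 2:
            # Two leaves share a child, so they form the resolved pair.
            return frozenset(max(groups, key=len))
        else:
            # Three different children: the triplet is unresolved here.
            return None


# A root with a binary subtree on the left and a polytomy on the right.
example = ((("a", "b"), "c"), ("d", "e", "f"))
print(triplet_topology(example, ("a", "b", "c")))  # the pair {a, b}
print(triplet_topology(example, ("d", "e", "f")))  # None (polytomy)
```

This sketch recomputes leaf sets at every step, so it is meant only to illustrate the definition; it makes no attempt at the efficiency the paper is concerned with.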
Our first class of measures consists of parametric distances, in which a parameter weighs the difference between an unresolved triplet/quartet topology and a resolved one. Our second class of measures is based on the Hausdorff distance. Each tree is viewed as the set of all possible ways in which it could be refined to eliminate unresolved nodes. The distance between the original (unresolved) trees is then taken to be the Hausdorff distance between the associated sets of fully resolved trees, where the distance between trees in the sets is the triplet or quartet distance, as appropriate.
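As an illustration of the parametric idea for rooted trees, the sketch below scores each triplet by comparing the topologies that two trees induce on it. The exact scoring is an assumption consistent with the abstract rather than a quotation of the paper's definition: a triplet resolved differently in the two trees costs 1, a triplet resolved in exactly one tree costs `p`, and a triplet treated identically (including unresolved in both) costs 0. The nested-tuple encoding and the function names are hypothetical, and the brute-force enumeration of all triplets is not one of the efficient algorithms the paper describes.

```python
from itertools import combinations


def clusters(tree):
    """Return (leaf set, list of clusters) for a nested-tuple rooted tree.

    A cluster is the leaf set below an internal node; leaves are strings.
    """
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaf_set, found = frozenset(), []
    for child in tree:
        child_leaves, child_clusters = clusters(child)
        leaf_set |= child_leaves
        found.extend(child_clusters)
    found.append(leaf_set)
    return leaf_set, found


def topology(cluster_list, triple):
    """Return the pair separated from the third leaf by some cluster, else None."""
    for pair in combinations(sorted(triple), 2):
        (third,) = set(triple) - set(pair)
        if any(set(pair) <= c and third not in c for c in cluster_list):
            return frozenset(pair)
    return None  # no cluster separates a pair: the triplet is unresolved


def parametric_triplet_distance(t1, t2, p):
    """Sum per-triplet costs: 1 if resolved differently, p if resolved in one tree only."""
    leaves1, c1 = clusters(t1)
    leaves2, c2 = clusters(t2)
    if leaves1 != leaves2:
        raise ValueError("trees must have the same leaf set")
    total = 0.0
    for triple in combinations(sorted(leaves1), 3):
        a, b = topology(c1, triple), topology(c2, triple)
        if a == b:
            continue  # same topology in both trees, or unresolved in both
        total += p if (a is None or b is None) else 1.0
    return total


resolved = ((("a", "b"), "c"), "d")
star = ("a", "b", "c", "d")        # a single polytomy at the root
other = ((("a", "c"), "b"), "d")
print(parametric_triplet_distance(resolved, star, 0.5))   # 2.0
print(parametric_triplet_distance(resolved, other, 0.5))  # 1.0
```

In the usage example, all four triplets are resolved in `resolved` and unresolved in `star`, giving 4 × 0.5 = 2.0, while `resolved` and `other` disagree only on the triplet {a, b, c}, giving 1.0.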
This work was supported in part by National Science Foundation AToL grant EF-0334832.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bansal, M.S., Dong, J., Fernández-Baca, D. (2008). Comparing and Aggregating Partially Resolved Trees. In: Laber, E.S., Bornstein, C., Nogueira, L.T., Faria, L. (eds) LATIN 2008: Theoretical Informatics. LATIN 2008. Lecture Notes in Computer Science, vol 4957. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78773-0_7
DOI: https://doi.org/10.1007/978-3-540-78773-0_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78772-3
Online ISBN: 978-3-540-78773-0
eBook Packages: Computer Science, Computer Science (R0)