Abstract
Synthesizing large-scale phylogenetic trees is a fundamental problem in evolutionary biology. Median tree problems have emerged as a powerful tool for reconstructing such trees. Given a tree collection, these problems seek a median tree under some problem-specific tree distance. Here, we introduce the median tree problem for the classical path-difference distance. We prove that this problem is NP-hard and describe a fast local search heuristic that is based on solving a local search problem exactly. To make the heuristic effective, we devise a time-efficient algorithm for this local search problem that improves on the best-known (naïve) solution by a factor of n, where n is the size of the input trees. Finally, we demonstrate the performance of our heuristic in a comparative study with other commonly used methods that synthesize species trees using published empirical data sets.
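To make the objective concrete, the following is a minimal sketch of the path-difference distance and of the cost of a candidate median tree. It rests on assumptions not stated in the abstract: trees are unrooted and given as adjacency maps, all trees share the same leaf set, the distance is the Euclidean norm of the differences in leaf-to-leaf edge counts, and the median cost is the plain sum of distances to the input trees. The helper names (tree_from_edges, leaf_path_lengths, path_difference, median_cost) are hypothetical. This is the naïve computation, not the improved algorithm or the local search heuristic the paper describes.

```python
from collections import deque
from math import sqrt


def tree_from_edges(edges):
    """Build an undirected adjacency map from (u, v) pairs (hypothetical helper)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def leaf_path_lengths(adj):
    """Map each leaf pair (a, b), a < b, to the number of edges on the path a..b."""
    leaves = sorted(v for v, nbrs in adj.items() if len(nbrs) == 1)
    lengths = {}
    for src in leaves:
        # Breadth-first search yields edge counts from src to every node.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        for dst in leaves:
            if src < dst:
                lengths[(src, dst)] = dist[dst]
    return lengths


def path_difference(adj1, adj2):
    """Euclidean norm of the difference of the two path-length vectors.

    Assumption: both trees have exactly the same leaf labels.
    """
    p1, p2 = leaf_path_lengths(adj1), leaf_path_lengths(adj2)
    if p1.keys() != p2.keys():
        raise ValueError("trees must share the same leaf set")
    return sqrt(sum((p1[k] - p2[k]) ** 2 for k in p1))


def median_cost(candidate, collection):
    """Sum of path-difference distances from the candidate to each input tree."""
    return sum(path_difference(candidate, t) for t in collection)


# Two quartet trees on leaves a, b, c, d: ab|cd and ac|bd.
t1 = tree_from_edges([("a", "u"), ("b", "u"), ("u", "v"), ("c", "v"), ("d", "v")])
t2 = tree_from_edges([("a", "u"), ("c", "u"), ("u", "v"), ("b", "v"), ("d", "v")])
print(path_difference(t1, t2))    # 2.0
print(median_cost(t1, [t1, t2]))  # 2.0
```

In this example the two trees disagree by one edge on four of the six leaf pairs, so the distance is the square root of 4. A median tree for a collection is then a tree minimizing median_cost over all candidate trees on the same leaves, which is the search space a local search heuristic would explore.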
Acknowledgments
The authors would like to thank the two anonymous reviewers for their constructive comments that helped to improve the quality of this work.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Markin, A., Eulenstein, O. (2016). Path-Difference Median Trees. In: Bourgeois, A., Skums, P., Wan, X., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2016. Lecture Notes in Computer Science, vol 9683. Springer, Cham. https://doi.org/10.1007/978-3-319-38782-6_18
DOI: https://doi.org/10.1007/978-3-319-38782-6_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-38781-9
Online ISBN: 978-3-319-38782-6
eBook Packages: Computer Science, Computer Science (R0)