Abstract
Gene tree parsimony (GTP) problems infer species supertrees from a collection of rooted gene trees that are confounded by evolutionary events such as gene duplication, gene duplication and loss, and deep coalescence. These problems are NP-complete, so they are often addressed by local search heuristics that perform a stepwise search of the tree space, where each step is guided by an exact solution to an instance of a local search problem. However, GTP problems require rooted input gene trees, whereas in practice most phylogenetic methods infer unrooted gene trees, which can be difficult to root correctly. In this work, we (i) define the first local NNI (nearest neighbor interchange) search problems for heuristically solving the GTP equivalents for unrooted input gene trees, called unrooted GTP problems, and (ii) describe linear time algorithms for these local search problems. We implemented the first NNI-based local search heuristics for unrooted GTP problems, which enable analyses of thousands of genes. Further, an analysis of a large plant data set using the unrooted NNI search provides support for an intriguing new hypothesis regarding the evolutionary relationships among major groups of flowering plants.
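To make the setting concrete, the following is a minimal and deliberately naive sketch of an NNI local search under the gene duplication cost only. It is not the linear time algorithm described in the paper. It assumes trees are binary and encoded as nested Python tuples whose leaves are species names, and it assumes the cost of an unrooted gene tree is the minimum duplication cost over all of its rootings. All function names (`clusters`, `dup_cost`, `rootings`, `nni_neighbors`, `nni_search`) are hypothetical and chosen for illustration.

```python
def leaves(t):
    """Leaf labels below t, as a frozenset."""
    return frozenset([t]) if isinstance(t, str) else leaves(t[0]) | leaves(t[1])

def clusters(s):
    """All clusters (leaf sets of subtrees) of a rooted species tree."""
    if isinstance(s, str):
        return {leaves(s)}
    return {leaves(s)} | clusters(s[0]) | clusters(s[1])

def dup_cost(g, s_clusters):
    """Duplications in rooted gene tree g under the LCA mapping (naive lookup)."""
    def lca(label_set):
        # Smallest species cluster containing the set; unique in a tree.
        return min((c for c in s_clusters if label_set <= c), key=len)
    def walk(node):
        if isinstance(node, str):
            return lca(leaves(node)), 0
        (ml, dl), (mr, dr) = walk(node[0]), walk(node[1])
        m = lca(leaves(node))
        # A node is a duplication if it maps to the same place as a child.
        return m, dl + dr + int(m == ml or m == mr)
    return walk(g)[1]

def rootings(t):
    """Every rooting of the unrooted tree underlying t (one per edge)."""
    def down(sub, rest):
        yield (sub, rest)
        if not isinstance(sub, str):
            a, b = sub
            yield from down(a, (b, rest))
            yield from down(b, (a, rest))
    left, right = t
    yield from down(left, right)
    if not isinstance(right, str):
        a, b = right
        yield from down(a, (b, left))
        yield from down(b, (a, left))

def nni_neighbors(s):
    """Rooted trees exactly one NNI move away from s."""
    if isinstance(s, str):
        return
    l, r = s
    if not isinstance(l, str):
        a, b = l
        yield ((a, r), b)
        yield ((b, r), a)
    if not isinstance(r, str):
        a, b = r
        yield ((l, a), b)
        yield ((l, b), a)
    for n in nni_neighbors(l):
        yield (n, r)
    for n in nni_neighbors(r):
        yield (l, n)

def total_cost(s, gene_trees):
    """Sum over gene trees of the best duplication cost over all rootings."""
    sc = clusters(s)
    return sum(min(dup_cost(r, sc) for r in rootings(g)) for g in gene_trees)

def nni_search(s, gene_trees):
    """Greedy descent: move to the best NNI neighbor until none improves."""
    best, best_cost = s, total_cost(s, gene_trees)
    while True:
        scored = [(total_cost(n, gene_trees), n) for n in nni_neighbors(best)]
        if not scored:
            return best, best_cost
        cost, cand = min(scored, key=lambda pair: pair[0])
        if cost >= best_cost:
            return best, best_cost
        best, best_cost = cand, cost

# Usage: one unrooted gene tree (its root is ignored) and a start species tree.
genes = [(("a", "b"), ("c", "d"))]
start = ((("a", "c"), "b"), "d")
print(nni_search(start, genes))
```

This sketch rescores every neighbor from scratch, so each search step costs far more than linear time. The paper's contribution is precisely to avoid that recomputation.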
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Górecki, P., Burleigh, J.G., Eulenstein, O. (2012). GTP Supertrees from Unrooted Gene Trees: Linear Time Algorithms for NNI Based Local Searches. In: Bleris, L., Măndoiu, I., Schwartz, R., Wang, J. (eds) Bioinformatics Research and Applications. ISBRA 2012. Lecture Notes in Computer Science, vol 7292. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30191-9_11
DOI: https://doi.org/10.1007/978-3-642-30191-9_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30190-2
Online ISBN: 978-3-642-30191-9
eBook Packages: Computer Science (R0)