Accumulated Coalescence Rank and Excess Gene Count for Species Tree Inference

  • Conference paper

Algorithms for Computational Biology (AlCoB 2016)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 9702)

Abstract

We propose a novel summary-based method for inferring species trees from input multi-locus gene trees in the presence of incomplete lineage sorting (ILS). The method extends an existing technique, STAR [13], which defines the average coalescence rank between taxa pairs (couplets) and derives species trees using Neighbor-Joining (NJ) [20, 23]. Such a coalescence rank, however, is ambiguous at the couplet level. We propose two new couplet-based distance measures, termed accumulated coalescence rank (AcR) and excess gene tree leaves (XL), and show that their combination discriminates individual couplets better. We propose a new method, AcRNJXL, which uses the proposed measures for NJ-based species tree construction. Results show that on biological datasets AcRNJXL performs much better than STAR and other reference approaches, with the same time and space complexities as STAR.
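
To make the notion of an average coalescence rank between couplets concrete, the following minimal Python sketch computes it for a handful of rooted gene trees. It is an illustration only, not the authors' implementation, and it rests on assumptions not stated in the abstract: trees are written as nested tuples of taxon names, the root of each gene tree receives rank equal to the number of taxa, and the rank decreases by one per edge away from the root (the convention commonly attributed to STAR). The function names `leaves`, `couplet_ranks`, and `average_ranks` are hypothetical. The AcR and XL measures are not sketched here because the abstract does not define them.

```python
from collections import defaultdict
from itertools import combinations


def leaves(tree):
    """Return the set of taxon names under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    found = set()
    for child in tree:
        found |= leaves(child)
    return found


def couplet_ranks(tree, n_taxa):
    """Map each couplet to the rank of its most recent common ancestor.

    Assumed convention: root rank = n_taxa, minus one per edge below the root.
    """
    ranks = {}

    def visit(node, rank):
        if isinstance(node, str):
            return
        child_sets = [leaves(child) for child in node]
        # Taxa in different child subtrees coalesce at this node.
        for left, right in combinations(child_sets, 2):
            for x in left:
                for y in right:
                    ranks[frozenset((x, y))] = rank
        for child in node:
            visit(child, rank - 1)

    visit(tree, n_taxa)
    return ranks


def average_ranks(gene_trees, n_taxa):
    """Average each couplet's coalescence rank over the gene trees containing it."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tree in gene_trees:
        for couplet, rank in couplet_ranks(tree, n_taxa).items():
            totals[couplet] += rank
            counts[couplet] += 1
    return {couplet: totals[couplet] / counts[couplet] for couplet in totals}


# Two toy gene trees on four taxa that disagree about the sister of A.
toy_trees = [
    ((("A", "B"), "C"), "D"),
    ((("A", "C"), "B"), "D"),
]
toy_average = average_ranks(toy_trees, n_taxa=4)
print(toy_average[frozenset(("A", "B"))])  # 2.5
print(toy_average[frozenset(("B", "C"))])  # 3.0
```

In this toy input the couplets (A, B) and (A, C) receive the same average rank of 2.5, which gives a small picture of how an averaged rank can fail to separate couplets. A matrix built from such averages could then be passed to any Neighbor-Joining implementation.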

References

  1. Chaudhary, R., Bansal, M.S., Wehe, A., Fernández-Baca, D., Eulenstein, O.: iGTP: a software package for large-scale gene tree parsimony analysis. BMC Bioinform. 11(574), 1–7 (2010)

  2. Chaudhary, R., Burleigh, J.G., Fernández-Baca, D.: Inferring species trees from incongruent multi-copy gene trees using the Robinson-Foulds distance. Algorithms Mol. Biol. 8:28(1), 1–12 (2013)

  3. Chiari, Y., Cahais, V., Galtier, N., Delsuc, F.: Phylogenomic analyses support the position of turtles as the sister group of birds and crocodiles (archosauria). BMC Biol. 10(65), 1–14 (2012)

  4. DeGiorgio, M., Degnan, J.: Robustness to divergence time underestimation when inferring species trees from estimated gene trees. Syst. Biol. 63(1), 66–82 (2014)

  5. Degnan, J.H., Rosenberg, N.A.: Gene tree discordance, phylogenetic inference and the multispecies coalescent. Trends Ecol. Evol. 24(6), 332–340 (2009)

  6. Heled, J., Drummond, A.J.: Bayesian inference of species trees from multilocus data. Mol. Biol. Evol. 27(3), 570–580 (2010)

  7. Kubatko, L.S., Carstens, B.C., Knowles, L.: Stem: species tree estimation using maximum likelihood for gene trees under coalescence. Bioinformatics 25(7), 971–973 (2009)

  8. Larget, B.R., Kotha, S.K., Dewey, C.N., Ané, C.: Bucky: Gene tree/species tree reconciliation with bayesian concordance analysis. Bioinformatics 26(22), 2910–2911 (2010)

  9. Liu, L.: Best: bayesian estimation of species trees under the coalescent model. Bioinformatics 24(21), 2542–2543 (2008)

  10. Liu, L., Xi, Z., Wu, S., Davis, C.C., Edwards, S.V.: Estimating phylogenetic trees from genome-scale data. Ann. N. Y. Acad. Sci. 1360(1), 36–53 (2015)

  11. Liu, L., Yu, L.: Estimating species trees from unrooted gene trees. Syst. Biol. 60(5), 661–667 (2011)

  12. Liu, L., Yu, L., Edwards, S.V.: A maximum pseudo-likelihood approach for estimating species trees under the coalescent model. BMC Evol. Biol. 10(302), 1–18 (2010)

  13. Liu, L., Yu, L., Pearl, D.K., Edwards, S.V.: Estimating species phylogenies using coalescence times among sequences. Syst. Biol. 58(5), 468–477 (2009)

  14. Mirarab, S., Bayzid, M.S., Warnow, T.: Evaluating summary methods for multilocus species tree estimation in the presence of incomplete lineage sorting. Syst. Biol. p. syu063 (2014). doi:10.1093/sysbio/syu063

  15. Mirarab, S., Reaz, R., Bayzid, M.S., Zimmermann, T., Swenson, M.S., Warnow, T.: Astral: genome-scale coalescent-based species tree estimation. Bioinformatics 30(17), i541–i548 (2014)

  16. Mirarab, S., Warnow, T.: ASTRAL-II: coalescent-based species tree estimation with many hundreds of taxa and thousands of genes. Bioinformatics 31(12), i44–i52 (2015)

  17. Mossel, E., Roch, S.: Incomplete lineage sorting: consistent phylogeny estimation from multiple loci. IEEE/ACM Trans. Comput. Biol. Bioinform. 7(1), 166–171 (2010)

  18. Nakhleh, L.: Computational approaches to species phylogeny inference and gene tree reconciliation. Trends Ecol. Evol. 28(12), 719–728 (2013)

  19. Roch, S., Steel, M.: Likelihood-based tree reconstruction on a concatenation of aligned sequence data sets can be statistically inconsistent. Theor. Population Biol. 100, 56–62 (2015)

  20. Saitou, N., Nei, M.: The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4(4), 406–425 (1987)

  21. Song, S., Liu, L., Edwards, S.V., Wu, S.: Resolving conflict in eutherian mammal phylogeny using phylogenomics and the multispecies coalescent model. Proc. Nat. Acad. Sci. USA 109(37), 14942–14947 (2012)

  22. Stamatakis, A.: RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22(21), 2688–2690 (2006)

  23. Studier, J.A., Keppler, K.L.: A note on the neighbor-joining algorithm of saitou and nei. Mol. Biol. Evol. 5(6), 729–731 (1988)

  24. Sukumaran, J., Holder, M.T.: DendroPy: a python library for phylogenetic computing. Bioinformatics 26(12), 1569–1571 (2010)

  25. Than, C., Nakhleh, L.: Species tree inference by minimizing deep coalescences. PLOS Comput. Biol. 5(9), 1–12 (2009)

  26. Wickett, N.J., et al.: Phylotranscriptomic analysis of the origin and early diversification of land plants. Proc. Nat. Acad. Sci. USA 111(45), E4859–E4868 (2014)

  27. Xi, Z., Liu, L., Rest, J.S., Davis, C.C.: Coalescent versus concatenation methods and the placement of amborella as sister to water lilies. Syst. Biol. 63(6), 919–932 (2014)

  28. Yu, Y., Warnow, T., Nakhleh, L.: Algorithms for MDC-based multi-locus phylogeny inference: beyond rooted binary gene trees on single alleles. J. Comput. Biol. 18(11), 1543–1559 (2011)

Acknowledgments

The first author acknowledges Tata Consultancy Services (TCS) for providing the research scholarship.

Corresponding author

Correspondence to Sourya Bhattacharyya.

Copyright information

© 2016 Springer International Publishing Switzerland

Cite this paper

Bhattacharyya, S., Mukhopadhyay, J. (2016). Accumulated Coalescence Rank and Excess Gene Count for Species Tree Inference. In: Botón-Fernández, M., Martín-Vide, C., Santander-Jiménez, S., Vega-Rodríguez, M.A. (eds) Algorithms for Computational Biology. AlCoB 2016. Lecture Notes in Computer Science, vol 9702. Springer, Cham. https://doi.org/10.1007/978-3-319-38827-4_8

  • DOI: https://doi.org/10.1007/978-3-319-38827-4_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-38826-7

  • Online ISBN: 978-3-319-38827-4

  • eBook Packages: Computer Science, Computer Science (R0)
