Fast Algorithms for Inferring Gene-Species Associations

  • Conference paper
Bioinformatics Research and Applications (ISBRA 2015)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 9096)

Abstract

Microbial biodiversity is typically assessed by sequencing either PCR-amplified marker genes or all genomic DNA from environmental samples. Both approaches rely on the similarity of the sequenced material to known entries in sequence databases. However, amplicons of non-marker genes are often used when the research question concerns both the functional capabilities of a microbial community and its biodiversity. In such cases, a phylogenetic tree is constructed from known and metagenomic sequences, and expert assessment defines the taxonomic groups to which the amplicons belong. Here, instead of relying on reference sequences of non-marker genes, which are often missing, we use tree reconciliation to obtain a distribution of mappings between genes and species. We describe efficient algorithms for the reconstruction of gene-species mappings and a Monte-Carlo method for inferring distributions in cases where the number of optimal reconstructions is large. We provide a comparative study of different cost functions showing that the duplication-loss cost induces mappings of the highest quality. Further, we demonstrate the correctness of our approach using several datasets.
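
To make the duplication-loss cost mentioned in the abstract concrete, the following is a minimal sketch of how that cost can be evaluated for one fixed gene-species mapping, using the standard least-common-ancestor (LCA) reconciliation. It is not the authors' algorithm, which infers the mapping itself. The nested-tuple tree encoding, the assumption that both trees are rooted and binary, and the names `index_species_tree`, `lca`, and `dl_cost` are all illustrative choices that do not come from the paper.

```python
def index_species_tree(tree):
    """Record the parent and depth of every node of a nested-tuple species tree."""
    parent, depth = {tree: None}, {tree: 0}
    stack = [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, tuple):          # internal node: (left, right)
            for child in node:
                parent[child] = node
                depth[child] = depth[node] + 1
                stack.append(child)
    return parent, depth


def lca(u, v, parent, depth):
    """Least common ancestor of species-tree nodes u and v."""
    while depth[u] > depth[v]:
        u = parent[u]
    while depth[v] > depth[u]:
        v = parent[v]
    while u != v:
        u, v = parent[u], parent[v]
    return u


def dl_cost(gene_tree, species_tree, leaf_species):
    """Return (duplications, losses) of the LCA reconciliation.

    leaf_species maps each gene-tree leaf label to a species-tree leaf label
    (assumed to be given here; hypothetical input format).
    """
    parent, depth = index_species_tree(species_tree)
    dups = losses = 0

    def visit(g):
        nonlocal dups, losses
        if not isinstance(g, tuple):         # gene leaf: look up its species
            return leaf_species[g]
        left, right = (visit(child) for child in g)
        m = lca(left, right, parent, depth)
        is_dup = m in (left, right)          # maps to the same node as a child
        dups += is_dup
        for child_map in (left, right):
            # Species-tree edges skipped between m and the child's mapping;
            # a speciation accounts for one of those edges itself.
            losses += depth[child_map] - depth[m] - (0 if is_dup else 1)
        return m

    visit(gene_tree)
    return dups, losses


species = (("a", "b"), "c")
gene = (("a1", "c1"), "b1")
mapping = {"a1": "a", "b1": "b", "c1": "c"}
print(dl_cost(gene, species, mapping))
```

Under these assumptions, a gene leaf of unknown origin could be handled naively by calling `dl_cost` once per candidate species and comparing the resulting costs; the paper describes more efficient algorithms for that task.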

Corresponding author

Correspondence to Arkadiusz Betkier.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Betkier, A., Szczęsny, P., Górecki, P. (2015). Fast Algorithms for Inferring Gene-Species Associations. In: Harrison, R., Li, Y., Măndoiu, I. (eds) Bioinformatics Research and Applications. ISBRA 2015. Lecture Notes in Computer Science, vol 9096. Springer, Cham. https://doi.org/10.1007/978-3-319-19048-8_4

  • DOI: https://doi.org/10.1007/978-3-319-19048-8_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19047-1

  • Online ISBN: 978-3-319-19048-8

  • eBook Packages: Computer Science, Computer Science (R0)
