New Algorithms for Computing Phylogenetic Biodiversity

  • Conference paper
Algorithms in Bioinformatics (WABI 2014)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 8701)

Abstract

A common problem that appears in many case studies in ecology is the following: given a rooted phylogenetic tree \(\mathcal{T}\) and a subset R of its leaf nodes, we want to compute the distance between the elements in R. A very popular distance measure that can be used for this purpose is the Phylogenetic Diversity (PD), which is defined as the cost of the minimum weight Steiner tree in \(\mathcal{T}\) that spans the nodes in R. To analyse the value of the PD for a given set R, it is also important to calculate the variance of this measure. However, the best algorithm known so far for computing the variance of the PD is inefficient; for any input tree \(\mathcal{T}\) that consists of n nodes, this algorithm has \(\Theta(n^2)\) running time. Moreover, efficiently computing the variance and higher order statistical moments is a major open problem for several other phylogenetic measures. We provide the following results:

  • We describe a new algorithm that, in practice, efficiently computes the variance of the PD. This algorithm has \(O(\mathrm{SI}(\mathcal{T}) + \mathrm{DSSI}^2(\mathcal{T}))\) running time; here \(\mathrm{SI}(\mathcal{T})\) denotes the Sackin’s Index of \(\mathcal{T}\), and \(\mathrm{DSSI}(\mathcal{T})\) is a new index whose value depends on how balanced \(\mathcal{T}\) is.

  • We provide, for the first time, exact formulas for computing the mean and the variance of another popular biodiversity measure, the Mean Nearest Taxon Distance (MNTD). These formulas apply specifically to ultrametric trees. For an ultrametric tree \(\mathcal{T}\) of n nodes, we show how to compute the mean of the MNTD in \(O(n)\) time, and its variance in \(O(\mathrm{SI}(\mathcal{T}) + \mathrm{DSSI}^2(\mathcal{T}))\) time.

  • We introduce a new measure which we call the Core Ancestor Cost (CAC). A major advantage of this measure is that for any integer k > 0 we can compute the first k statistical moments of the CAC in \(O(\mathrm{SI}(\mathcal{T}) + nk + k^2)\) time in total, using \(O(n + k)\) space.
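To make two of the quantities above concrete, here is a minimal Python sketch, not the authors' implementation. It computes the PD of a leaf set as the cost of the minimal subtree connecting that set, following the Steiner tree definition given above, and it computes the Sackin's Index using the common definition (the sum of leaf depths measured in edges; the paper's exact convention may differ). The toy tree, the `child -> (parent, weight)` representation, and all function names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy tree: child -> (parent, edge weight). "root" has no entry.
TREE = {
    "u": ("root", 1.0),
    "c": ("root", 3.0),
    "a": ("u", 2.0),
    "b": ("u", 2.0),
}


def leaves_of(tree):
    """Return the nodes that never appear as a parent."""
    parents = {parent for parent, _ in tree.values()}
    return [node for node in tree if node not in parents]


def phylogenetic_diversity(tree, sample):
    """Cost of the minimal subtree that connects the leaves in `sample`.

    The edge above node v belongs to that subtree exactly when the subtree
    rooted at v holds at least one, but not all, of the sampled leaves.
    """
    sample = set(sample)
    below = defaultdict(int)  # below[v] = sampled leaves in the subtree of v
    for leaf in sample:
        node = leaf
        while node in tree:  # the root has no entry, so the walk stops there
            below[node] += 1
            node = tree[node][0]
    return sum(tree[v][1] for v, k in below.items() if 0 < k < len(sample))


def sackin_index(tree):
    """Sum of leaf depths (in edges), one common definition of the index."""
    total = 0
    for leaf in leaves_of(tree):
        node = leaf
        while node in tree:
            total += 1
            node = tree[node][0]
    return total


print(phylogenetic_diversity(TREE, {"a", "b"}))  # edges a-u and u-b
print(sackin_index(TREE))
```

Both functions walk from each leaf up to the root, so their cost is proportional to the sum of the leaf depths involved. This is only meant to show why a quantity like the Sackin's Index arises naturally in running-time bounds for tree traversals of this kind.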

We have implemented the new algorithms for computing the variance of the PD and of the MNTD, and the statistical moments of the CAC. We conducted experiments on large phylogenetic datasets and we show that our algorithms perform efficiently in practice.
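For very small trees, the mean and variance of a measure can be checked by brute force. The sketch below is such a reference check in Python, not the efficient algorithms summarized above. It assumes that leaf samples of a fixed size r are drawn uniformly at random (the abstract does not state the sampling model), and it uses the standard definition of the MNTD: the average, over the sampled leaves, of the distance to the nearest other sampled leaf. The distance table and function names are hypothetical.

```python
from itertools import combinations
from statistics import mean, pvariance

# Hypothetical leaf-to-leaf path distances of a small ultrametric tree.
DIST = {
    frozenset("ab"): 4.0,
    frozenset("ac"): 6.0,
    frozenset("bc"): 6.0,
}
LEAVES = ["a", "b", "c"]


def mntd(sample):
    """Mean distance from each sampled leaf to its nearest sampled neighbour."""
    return mean(
        min(DIST[frozenset((x, y))] for y in sample if y != x)
        for x in sample
    )


def brute_force_moments(measure, leaves, r):
    """Mean and variance of `measure` over all leaf subsets of size r.

    Assumes every subset of size r is equally likely. The number of subsets
    grows exponentially, so this is only usable on toy inputs.
    """
    values = [measure(subset) for subset in combinations(leaves, r)]
    return mean(values), pvariance(values)


mean_mntd, var_mntd = brute_force_moments(mntd, LEAVES, 2)
print(mean_mntd, var_mntd)
```

Enumerating every subset is exactly what the closed-form results avoid. A check like this is useful mainly for validating a faster implementation on inputs small enough to enumerate.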

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tsirogiannis, C., Sandel, B., Kalvisa, A. (2014). New Algorithms for Computing Phylogenetic Biodiversity. In: Brown, D., Morgenstern, B. (eds) Algorithms in Bioinformatics. WABI 2014. Lecture Notes in Computer Science, vol 8701. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44753-6_15

  • DOI: https://doi.org/10.1007/978-3-662-44753-6_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-44752-9

  • Online ISBN: 978-3-662-44753-6

  • eBook Packages: Computer Science, Computer Science (R0)
