Skip to main content

A Robinson-Foulds Measure to Compare Unrooted Trees with Rooted Trees

  • Conference paper
Bioinformatics Research and Applications (ISBRA 2012)

Part of the book series: Lecture Notes in Computer Science ((LNBI,volume 7292))

Included in the following conference series:

Abstract

The Robinson-Foulds (RF) distance is a well-established measure for the comparative analysis of phylogenetic trees, and comparing unrooted gene trees with rooted species trees is a standard application in phylogenetics. However, the RF distance is not defined to compare unrooted trees with rooted trees. Here we propose urRF a new measure based on the standard RF distance to compare an unrooted tree with a rooted tree, and describe a linear time algorithm to compute urRF. Further, we show that our measure relates to several intriguing properties that can be crucial for the in-depth comparative study and the credible rooting of trees. Finally, we present detailed empirical studies based on real data sets demonstrating the performance of our novel RF measure, the efficiency of our linear time algorithm and the quality of determining optimal rootings of unrooted gene trees.

Support was provided to PG by the grant of MNiSW #N N301 065236, to OE by the NSF (#0830012 and #106029), and to PG and OE by NCN #2011/01/B/ST6/02777 and the NIMBioS Working Group: Gene Tree Reconciliation through NSF #EF-0832858.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Bansal, M.S., Burleigh, J.G., Eulenstein, O., Fernández-Baca, D.: Robinson-foulds supertrees. Algorithms for Molecular Biology 5, 18 (2010)

    Article  Google Scholar 

  2. Bender, M.A., Farach-Colton, M.: The lca Problem Revisited. In: Gonnet, G.H., Viola, A. (eds.) LATIN 2000. LNCS, vol. 1776, pp. 88–94. Springer, Heidelberg (2000)

    Chapter  Google Scholar 

  3. Bordewich, M., Semple, C.: On the computational complexity of the rooted subtree prune and regraft distance. Annals of Combinatorics 8, 409–423 (2004)

    Article  MathSciNet  MATH  Google Scholar 

  4. Bourque, M.: Abres de Steiner et reseaux dont varie I’emplagement de certains sommnets. PhD thesis, Department d’Informatique, University Montréal, Montréal (1978)

    Google Scholar 

  5. Bryant, D., Steel, M.: Computing the distribution of a tree metric. IEEE/ACM Trans. Comput. Biol. Bioinform. 6(3), 420–426 (2009)

    Article  Google Scholar 

  6. Bryant, D., Tsang, J., Kearney, P.E., Li, M.: Computing the quartet distance between evolutionary trees. In: Symposium on Discrete Algorithms, pp. 285–286 (2000)

    Google Scholar 

  7. Chaudhary, R., Burleigh, J.G., Fernández-Baca, D.: Fast Local Search for Unrooted Robinson-Foulds Supertrees. In: Chen, J., Wang, J., Zelikovsky, A. (eds.) ISBRA 2011. LNCS, vol. 6674, pp. 184–196. Springer, Heidelberg (2011)

    Chapter  Google Scholar 

  8. Chen, F., Mackey, A.J., Stoeckert, C.J., Roos, D.S.: Orthomcl-db: querying a comprehensive multi-species collection of ortholog groups. Nucleic Acids Research 34(suppl.1), D363–D368

    Google Scholar 

  9. DasGupta, B., He, X., Jiang, T., Li, M., Tromp, J., Zhang, L.: On distances between phylogenetic trees. In: SODA, pp. 427–436 (1997)

    Google Scholar 

  10. Day, W.H.E.: Optimal algorithms for comparing trees with labeled leaves. Journal of Classification 2(1), 7–28 (1985)

    Article  MathSciNet  MATH  Google Scholar 

  11. Edgar, R.C.: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32, 1792–1797 (2004)

    Article  Google Scholar 

  12. Eulenstein, O., Mirkin, B., Vingron, M.: Duplication-based measures of difference between gene and species trees. J. Comput. Biol. 5(1), 135–148 (1998)

    Article  Google Scholar 

  13. Felsenstein, J.: Inferring phylogenies. Sinauer Associates (2004)

    Google Scholar 

  14. Goodman, M., Czelusniak, J., Moore, G.W., Romero-Herrera, A.E., Matsuda, G.: Fitting the gene lineage into its species lineage, a parsimony strategy illustrated by cladograms constructed from globin sequences. Systematic Zoology 28(2), 132–163 (1979)

    Article  Google Scholar 

  15. Górecki, P., Tiuryn, J.: Dls-trees: A model of evolutionary scenarios. Theor. Comput. Sci. 359(1-3), 378–399 (2006)

    Article  MATH  Google Scholar 

  16. Górecki, P., Tiuryn, J.: Inferring phylogeny from whole genomes. Bioinformatics 23(2), e116–e222 (2007)

    Article  Google Scholar 

  17. Guindon, S., Delsuc, F., Dufayard, J., Gascuel, O.: Estimating maximum likelihood phylogenies with PhyML. Methods Mol. Biol. 537, 113–137 (2009)

    Article  Google Scholar 

  18. Mecham, C.A.: Theoretical and computational considerations of the compatibility of qualitative taxonomic characters. NATO ASI Series, vol. 1, pp. 304–314. Springer, Berlin (1983)

    Google Scholar 

  19. Mirkin, B., Muchnik, I.B., Smith, T.F.: A biologically consistent model for comparing molecular phylogenies. J. Comput. Biol. 2(4), 493–507 (1995)

    Article  Google Scholar 

  20. Page, R.D.M.: Maps between trees and cladistic analysis of historical associations among genes, organisms, and areas. Systematic Biology 43(1), 58–77 (1994)

    Google Scholar 

  21. Pattengale, N.D., Gottlieb, E.J., Moret, B.M.E.: Efficiently computing the robinson-foulds metric. J. Comput. Biol. 14(6), 724–735 (2007)

    Article  MathSciNet  Google Scholar 

  22. Robinson, D.F., Foulds, L.R.: Comparison of weighted labelled trees. Lecture Notes in Mathematics, vol. 748, pp. 119–126 (1979)

    Google Scholar 

  23. Sanderson, M., McMahon, M.: Inferring angiosperm phylogeny from est data with widespread gene duplication. BMC Evolutionary Biology 7(suppl.1) (2007)

    Google Scholar 

  24. Sayers, E.W., et al.: Database resources of the national center for biotechnology information. Nucleic Acids Research 37(suppl.1), D5–D15 (2009)

    Article  Google Scholar 

  25. Semple, C., Steel, M.A.: Phylogenetics. Oxford University Press (2003)

    Google Scholar 

  26. Smith, A.: Rooting molecular trees: problems and strategies. Biol. J. Linn. Soc. 51, 279–292

    Google Scholar 

  27. Strimmer, K., von Haeseler, A.: Quartet puzzling: A quartet maximum likelihood method for reconstructing tree topologies. Molecular Biology and Evolution 13, 964–969 (1996)

    Article  Google Scholar 

  28. Wheeler, W.: Nucleic acid sequence phylogeny and random outgroups. Cladistics – The International Journal of the Willi Hennig Society 51, 363–368 (1990)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Górecki, P., Eulenstein, O. (2012). A Robinson-Foulds Measure to Compare Unrooted Trees with Rooted Trees. In: Bleris, L., Măndoiu, I., Schwartz, R., Wang, J. (eds) Bioinformatics Research and Applications. ISBRA 2012. Lecture Notes in Computer Science(), vol 7292. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30191-9_12

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-30191-9_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-30190-2

  • Online ISBN: 978-3-642-30191-9

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics