The Asymmetric Cluster Affinity Cost

  • Conference paper
Comparative Genomics (RECOMB-CG 2023)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 13883)

Abstract

Tree comparison costs are sophisticated tools used to compare the results of different phylogenetic hypotheses and reconstruction methods and to evaluate the robustness of a tree to data perturbations. The Robinson-Foulds distance is a widely used measure for comparing the topologies of two trees, but it is highly sensitive to tree error. Consequently, tree differences may be over-estimated, leading to incorrect inference. An approach to overcome this shortcoming is the Cluster Affinity distance, which is a refinement of the Robinson-Foulds distance. These distances are symmetric and thus designed to compare the same type of trees. However, it is common to compare different types of trees, such as gene trees compared with species trees, or the integration of different datasets into a supertree: these comparisons are inherently asymmetric. Here, we introduce the asymmetric Cluster Affinity cost, a relaxation of the original Affinity cost to compare heterogeneous trees. We demonstrate that the characteristics of this cost are similar to the symmetric Cluster Affinity distance. Further, for the asymmetric affinity cost we describe efficient algorithms, derive the exact diameters, and use these to standardize the cost to be applicable in practice.
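
To make the contrast in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes rooted trees written as nested tuples of leaf labels, takes the Robinson-Foulds distance as the size of the symmetric difference of the two cluster sets, and takes a one-directional affinity cost as the sum, over the clusters of a source tree, of the smallest symmetric difference to any cluster of a target tree. That last definition is an assumption about the general form of such a cost; the abstract does not state the formula, and the names `clusters`, `robinson_foulds`, and `one_way_affinity` are hypothetical.

```python
from typing import FrozenSet, Set, Tuple, Union

# A rooted tree is either a leaf label or a tuple of subtrees (illustrative encoding).
Tree = Union[str, Tuple["Tree", ...]]


def clusters(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the leaf set below every node of a rooted tree."""
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(visit(child) for child in node))
        found.add(leaves)
        return leaves

    visit(tree)
    return found


def robinson_foulds(t1: Tree, t2: Tree) -> int:
    """Count the clusters that occur in exactly one of the two trees."""
    return len(clusters(t1) ^ clusters(t2))


def one_way_affinity(source: Tree, target: Tree) -> int:
    """Assumed one-directional cost: each source cluster is charged the size of
    its smallest symmetric difference with any cluster of the target tree."""
    target_clusters = clusters(target)
    return sum(
        min(len(a ^ b) for b in target_clusters) for a in clusters(source)
    )


# An unresolved (star) tree and a fully resolved caterpillar on the same leaves.
star = ("a", "b", "c", "d")
caterpillar = ((("a", "b"), "c"), "d")

print(robinson_foulds(star, caterpillar))   # symmetric by construction
print(one_way_affinity(star, caterpillar))  # star -> caterpillar
print(one_way_affinity(caterpillar, star))  # caterpillar -> star
```

In this sketch the direction matters: every cluster of the star tree already occurs in the caterpillar, while the caterpillar's internal clusters have no exact counterpart in the star tree. The quadratic scan over cluster pairs is chosen for readability and says nothing about the efficient algorithms the paper describes.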

Acknowledgements

We thank the reviewers for their constructive and valuable comments. This work was supported in part by the U.S. Department of Agriculture (USDA) Agricultural Research Service (ARS project numbers 5030-32000-231-000-D and 5030-32000-231-095-S). The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication. Mention of trade names or commercial products in this article is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the USDA. USDA is an equal opportunity provider and employer. PG was supported by National Science Centre grant 2017/27/B/ST6/02720.

Author information

Corresponding authors

Correspondence to Sanket Wagle or Oliver Eulenstein.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Wagle, S., Markin, A., Górecki, P., Anderson, T., Eulenstein, O. (2023). The Asymmetric Cluster Affinity Cost. In: Jahn, K., Vinař, T. (eds) Comparative Genomics. RECOMB-CG 2023. Lecture Notes in Computer Science, vol 13883. Springer, Cham. https://doi.org/10.1007/978-3-031-36911-7_9

  • DOI: https://doi.org/10.1007/978-3-031-36911-7_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36910-0

  • Online ISBN: 978-3-031-36911-7

  • eBook Packages: Computer Science, Computer Science (R0)
