Reducing Multi-state to Binary Perfect Phylogeny with Applications to Missing, Removable, Inserted, and Deleted Data

  • Conference paper
Algorithms in Bioinformatics (WABI 2010)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 6293)

Abstract

Multi-State Perfect Phylogeny is an extension of Binary Perfect Phylogeny where characters are allowed more than two states. In this paper we consider four problems that extend its utility. In the Missing Data (MD) Problem, some entries in the input are missing, and the question is whether (bounded) values can be imputed so that the resulting data has a multi-state Perfect Phylogeny. In the Character-Removal (CR) Problem, we want to minimize the number of characters removed from the data so that the resulting data has a multi-state Perfect Phylogeny. In the Missing-Data Character-Removal (MDCR) Problem, we want to impute values for the missing data so as to minimize the solution to the resulting Character-Removal Problem. In the Insertion and Deletion (ID) Problem, insertion and deletion mutational events spanning multiple characters are also allowed.
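
For readers unfamiliar with the binary case that the multi-state problem extends, the following is a minimal sketch of the classical four-gamete test, assuming complete binary data with rows as taxa and columns as characters. It is a standard textbook check and not the method of this paper; the function name and the example matrices are hypothetical.

```python
from itertools import combinations


def has_binary_perfect_phylogeny(matrix):
    """Four-gamete test on a complete 0/1 matrix (rows = taxa, columns = characters).

    A complete binary matrix admits an (undirected) perfect phylogeny exactly
    when no pair of columns exhibits all four combinations 00, 01, 10, 11.
    """
    if not matrix:
        return True
    n_chars = len(matrix[0])
    for i, j in combinations(range(n_chars), 2):
        # Collect the distinct state pairs seen in columns i and j.
        gametes = {(row[i], row[j]) for row in matrix}
        if len(gametes) == 4:
            return False
    return True


# Hypothetical example data, chosen only for illustration.
compatible = [[0, 0], [0, 1], [1, 1]]
conflicting = [[0, 0], [0, 1], [1, 0], [1, 1]]
```

Under these assumptions, `has_binary_perfect_phylogeny(compatible)` holds, while `conflicting` fails because its two columns show all four gametes.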

In this paper, we introduce a new general conceptual solution to these four problems. The method reduces k-state problems to binary problems with missing data. This gives a new conceptual solution to the multi-state Perfect Phylogeny problem, and conceptual solutions to the MD, CR, MDCR and ID problems for any k, significantly improving on previous work. Empirical evaluations of our implementations show that they are faster than previously established methods for general k and effective on larger inputs.
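
To make the target of the reduction concrete, here is a hedged sketch of the binary problem with missing data, solved by brute force. It is not the reduction or the algorithm of this paper, which the abstract does not detail: it simply tries every 0/1 imputation and applies the four-gamete test, so it takes time exponential in the number of missing entries. The names `MISSING`, `four_gamete_ok` and `impute_binary` are hypothetical.

```python
from itertools import combinations, product

MISSING = None  # assumed marker for a missing entry


def four_gamete_ok(matrix):
    """True if no pair of columns of a complete 0/1 matrix shows all four gametes."""
    n_chars = len(matrix[0]) if matrix else 0
    return all(
        len({(row[i], row[j]) for row in matrix}) < 4
        for i, j in combinations(range(n_chars), 2)
    )


def impute_binary(matrix):
    """Return a completed copy of `matrix` that passes the four-gamete test, or None.

    Brute force over all 0/1 assignments to the missing entries.
    """
    holes = [
        (r, c)
        for r, row in enumerate(matrix)
        for c, value in enumerate(row)
        if value is MISSING
    ]
    for values in product((0, 1), repeat=len(holes)):
        filled = [list(row) for row in matrix]
        for (r, c), value in zip(holes, values):
            filled[r][c] = value
        if four_gamete_ok(filled):
            return filled
    return None


# Hypothetical inputs: the first can be completed, the second cannot.
solvable = [[0, 0], [0, 1], [1, 0], [1, MISSING]]
unsolvable = [[0, 0], [0, 1], [1, 0], [1, 1], [MISSING, 0]]
```

In this sketch `impute_binary(solvable)` fills the missing entry with 0, and `impute_binary(unsolvable)` returns `None` because the complete rows already contain all four gametes.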

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Stevens, K., Gusfield, D. (2010). Reducing Multi-state to Binary Perfect Phylogeny with Applications to Missing, Removable, Inserted, and Deleted Data. In: Moulton, V., Singh, M. (eds) Algorithms in Bioinformatics. WABI 2010. Lecture Notes in Computer Science, vol 6293. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15294-8_23

  • DOI: https://doi.org/10.1007/978-3-642-15294-8_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15293-1

  • Online ISBN: 978-3-642-15294-8

  • eBook Packages: Computer Science, Computer Science (R0)
