New Algorithms for the Genomic Duplication Problem

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 10562)

Abstract

One of the fundamental issues in evolutionary molecular biology is to discover genomic duplication events and their correspondence to the species tree. Such events can be reconstructed by clustering single gene duplications that are inferred by reconciling a set of gene trees with a species tree. Here we propose the first solution to the genomic duplication problem in which every reconciliation with the minimal number of single gene duplications is allowed, duplications are clustered by the method called minimum episodes, and the input gene trees are assumed to be unrooted. We also present an evaluation study of the proposed algorithms on empirical datasets.
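As a rough illustration of how single gene duplications can be inferred by reconciling a gene tree with a species tree, the sketch below uses the classical lowest-common-ancestor (LCA) mapping. It is not the algorithm proposed in the paper: it assumes a rooted, binary gene tree (the paper treats unrooted gene trees), it computes only the LCA reconciliation rather than every reconciliation with the minimal number of duplications, and it performs no clustering into episodes. The tuple-based tree encoding and the function names `clusters`, `lca_map`, and `duplications` are assumptions made for this example.

```python
# Trees are nested 2-tuples; leaves are species names (strings).
# Each gene-tree leaf is labeled by the species the gene was sampled from.

def clusters(tree):
    """Return (leaf set of the root, list of leaf sets of all nodes)."""
    if isinstance(tree, str):
        leaf = frozenset([tree])
        return leaf, [leaf]
    left, right = tree
    lset, lclusters = clusters(left)
    rset, rclusters = clusters(right)
    node = lset | rset
    return node, lclusters + rclusters + [node]

def lca_map(species_clusters, leaves):
    """Smallest species-tree cluster containing `leaves`, i.e. their LCA."""
    return min((c for c in species_clusters if leaves <= c), key=len)

def duplications(gene_tree, species_clusters):
    """Return (LCA mapping of the root, species nodes hosting a duplication).

    A gene-tree node is a duplication when it maps to the same
    species-tree node as at least one of its children.
    """
    if isinstance(gene_tree, str):
        return frozenset([gene_tree]), []
    left, right = gene_tree
    lmap, ldups = duplications(left, species_clusters)
    rmap, rdups = duplications(right, species_clusters)
    here = lca_map(species_clusters, lmap | rmap)
    dups = ldups + rdups
    if here == lmap or here == rmap:
        dups.append(here)
    return here, dups

if __name__ == "__main__":
    species_tree = (("a", "b"), "c")
    gene_tree = (("a", "c"), ("b", "c"))
    _, species_clusters = clusters(species_tree)
    _, dups = duplications(gene_tree, species_clusters)
    print(len(dups))  # 1: the gene-tree root duplicates at the species root
```

Each species-tree node is identified here by its set of leaves, which keeps the sketch short at the cost of efficiency; a practical implementation would use a constant-time LCA structure instead.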

Notes

  1. In this article, the notion of the plateau is used exclusively with the duplication cost. In the literature, it is often called a D-plateau to distinguish it from plateaus for other costs, e.g. the DL-plateau [47].

References

  1. Kellis, M., Birren, B.W., Lander, E.S.: Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae. Nature 428, 617–624 (2004)

  2. Guyot, R., Keller, B.: Ancestral genome duplication in rice. Genome 47(3), 610–614 (2004)

  3. Vision, T.J., Brown, D.G., Tanksley, S.D.: The origins of genomic duplications in Arabidopsis. Science 290(5499), 2114–2117 (2000)

  4. Costantino, L., Sotiriou, S.K., Rantala, J.K., Magin, S., et al.: Break-induced replication repair of damaged forks induces genomic duplications in human cells. Science 343(6166), 88–91 (2014)

  5. Cui, L., Wall, P.K., Leebens-Mack, J.H., Lindsay, B.G., et al.: Widespread genome duplications throughout the history of flowering plants. Genome Res. 16(6), 738–749 (2006)

  6. Aury, J.M., Jaillon, O., Duret, L., Noel, B., et al.: Global trends of whole-genome duplications revealed by the ciliate Paramecium tetraurelia. Nature 444(7116), 171–178 (2006)

  7. Van de Peer, Y., Maere, S., Meyer, A.: The evolutionary significance of ancient genome duplications. Nat. Rev. Genet. 10(10), 725–732 (2009)

  8. Vandepoele, K., Simillion, C., Van de Peer, Y.: Evidence that rice and other cereals are ancient aneuploids. Plant Cell. 15(9), 2192–2202 (2003)

  9. Sato, S., Tabata, S., Hirakawa, H., Asamizu, E., et al.: The tomato genome sequence provides insights into fleshy fruit evolution. Nature 485(7400), 635–641 (2012)

  10. Scossa, F., Brotman, Y., de Abreu e Lima, F., et al.: Genomics-based strategies for the use of natural variation in the improvement of crop metabolism. Plant Sci. 242, 47–64 (2016)

  11. Vanneste, K., Maere, S., Van de Peer, Y.: Tangled up in two: a burst of genome duplications at the end of the Cretaceous and the consequences for plant evolution. Philos. Trans. R. Soc. Lond. B Biol. Sci. 369(1648), 20130353 (2014)

  12. Tang, H., Bowers, J.E., Wang, X., Ming, R., et al.: Synteny and collinearity in plant genomes. Science 320(5875), 486–488 (2008)

  13. Holloway, P., Swenson, K., Ardell, D., El-Mabrouk, N.: Ancestral genome organization: an alignment approach. J. Comput. Biol. 20(4), 280–295 (2013)

  14. Blanc, G., Wolfe, K.H.: Widespread paleopolyploidy in model plant species inferred from age distributions of duplicate genes. Plant Cell 16(7), 1667–78 (2004)

  15. Bowers, J.E., Chapman, B.A., Rong, J., Paterson, A.H.: Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422(6930), 433–8 (2003)

  16. Jiao, Y., Wickett, N.J., Ayyampalayam, S., Chanderbali, A.S., et al.: Ancestral polyploidy in seed plants and angiosperms. Nature 473(7345), 97–100 (2011)

  17. Rabier, C.E., Ta, T., Ané, C.: Detecting and locating whole genome duplications on a phylogeny: a probabilistic approach. Mol. Biol. Evol. 31(3), 750–62 (2014)

  18. Page, R.D.M.: Maps between trees and cladistic analysis of historical associations among genes, organisms, and areas. Syst. Biol. 43(1), 58–77 (1994)

  19. Mirkin, B., Muchnik, I., Smith, T.F.: A biologically consistent model for comparing molecular phylogenies. J. Comput. Biol. 2(4), 493–507 (1995)

  20. Guigó, R., Muchnik, I.B., Smith, T.F.: Reconstruction of ancient molecular phylogeny. Mol. Phylogenet. Evol. 6(2), 189–213 (1996)

  21. Arvestad, L., Berglund, A.C., Lagergren, J., Sennblad, B.: Bayesian gene/species tree reconciliation and orthology analysis using MCMC. Bioinformatics 19(Suppl 1), i7–15 (2003)

  22. Bonizzoni, P., Della Vedova, G., Dondi, R.: Reconciling a gene tree to a species tree under the duplication cost model. Theor. Comput. Sci. 347(1–2), 36–53 (2005)

  23. Noutahi, E., Semeria, M., Lafond, M., Seguin, J., et al.: Efficient gene tree correction guided by genome evolution. PLoS ONE 11(8), 1–22 (2016)

  24. Schmidt-Böcking, H., Reich, K., Templeton, A., Trageser, W., Vill, V.: Reconstructing a supergenetree minimizing. BMC Bioinf. 16(14), S4 (2015)

  25. Dondi, R., Mauri, G., Zoppis, I.: Orthology correction for gene tree reconstruction: theoretical and experimental results. Proc. Comput. Sci. 108, 1115–1124 (2017)

  26. Scornavacca, C., Jacox, E., Szöllősi, G.J.: Joint amalgamation of most parsimonious reconciled gene trees. Bioinformatics 31(6), 841–848 (2014)

  27. Nakhleh, L.: Computational approaches to species phylogeny inference and gene tree reconciliation. Trends Ecol. Evol. 28(12), 719–728 (2013)

  28. Zhu, Y., Lin, Z., Nakhleh, L.: Evolution after whole-genome duplication: a network perspective. G3: Genes, Genomes, Genetics 3(11), 2049–2057 (2013)

  29. Zheng, Y., Zhang, L.: Effect of incomplete lineage sorting on tree-reconciliation-based inference of gene duplication. IEEE/ACM Trans. Comput. Biol. Bioinf. 11(3), 477–485 (2014)

  30. Duchemin, W., Anselmetti, Y., Patterson, M., Ponty, Y., et al.: DeCoSTAR: Reconstructing the ancestral organization of genes or genomes using reconciled phylogenies. Genome Biol. Evol. 9(5), 1312–1319 (2017)

  31. Goodman, M., Czelusniak, J., Moore, G.W., Romero-Herrera, A.E., et al.: Fitting the gene lineage into its species lineage, a parsimony strategy illustrated by cladograms constructed from globin sequences. Syst. Zool. 28(2), 132–163 (1979)

  32. Doyon, J.P., Chauve, C., Hamel, S.: Space of gene/species tree reconciliations and parsimonious models. J. Comput. Biol. 16(10), 1399–1418 (2009)

  33. Ma, B., Li, M., Zhang, L.: From gene trees to species trees. SIAM J. Comput. 30(3), 729–752 (2000)

  34. Stolzer, M., Lai, H., Xu, M., et al.: Inferring duplications, losses, transfers and incomplete lineage sorting with nonbinary species trees. Bioinformatics 28(18), i409–i415 (2012)

  35. Górecki, P., Tiuryn, J.: DLS-trees: a model of evolutionary scenarios. Theor. Comput. Sci. 359(1–3), 378–399 (2006)

  36. Paszek, J., Górecki, P.: Genomic duplication problems for unrooted gene trees. BMC Genom. 17(1), 165–175 (2016)

  37. Page, R.D.M., Cotton, J.A.: Vertebrate phylogenomics: reconciled trees and gene duplications. In: Pacific Symposium on Biocomputing, pp. 536–547 (2002)

  38. Bansal, M.S., Eulenstein, O.: The multiple gene duplication problem revisited. Bioinformatics 24(13), i132–8 (2008)

  39. Burleigh, J.G., Bansal, M.S., Wehe, A., Eulenstein, O.: Locating multiple gene duplications through reconciled trees. In: Vingron, M., Wong, L. (eds.) RECOMB 2008. LNCS, vol. 4955, pp. 273–284. Springer, Heidelberg (2008). doi:10.1007/978-3-540-78839-3_24

  40. Mettanant, V., Fakcharoenphol, J.: A linear-time algorithm for the multiple gene duplication problem. NCSEC, pp. 198–203 (2008)

  41. Luo, C.W., Chen, M.C., Chen, Y.C., Yang, R.W.L., et al.: Linear-time algorithms for the multiple gene duplication problems. IEEE/ACM Trans. Comput. Biol. Bioinform. 8(1), 260–265 (2011)

  42. Burleigh, J.G., Bansal, M.S., Eulenstein, O., Vision, T.J.: Inferring species trees from gene duplication episodes. ACM BCB, pp. 198–203 (2010)

  43. Paszek, J., Górecki, P.: Efficient algorithms for genomic duplication models. APBC 2017. IEEE/ACM Trans. Comput. Biol. Bioinform. doi:10.1109/TCBB.2017.2706679

  44. Fellows, M., Hallett, M., Stege, U.: On the multiple gene duplication problem. In: Chwa, K.-Y., Ibarra, O.H. (eds.) ISAAC 1998. LNCS, vol. 1533, pp. 348–357. Springer, Heidelberg (1998). doi:10.1007/3-540-49381-6_37

  45. Czabarka, E., Székely, L., Vision, T.: Minimizing the number of episodes and Gallai’s theorem on intervals. arXiv:1209.5699 (2012)

  46. Górecki, P., Tiuryn, J.: Inferring phylogeny from whole genomes. Bioinformatics 23(2), e116–e122 (2007)

  47. Górecki, P., Eulenstein, O., Tiuryn, J.: Unrooted tree reconciliation: a unified approach. IEEE/ACM Trans. Comput. Biol. Bioinform. 10(2), 522–536 (2013)

  48. Page, R.D.M., Charleston, M.A.: Reconciled trees and incongruent gene and species trees. Math. Hierarchies Biol. DIMACS 96 37, 57–70 (1997)

Acknowledgements

We would like to thank the reviewers for their detailed comments, which allowed us to improve our paper. Support was provided by NCN grants #2015/19/N/ST6/01193 and #2015/19/B/ST6/00726.

Author information

Corresponding author

Correspondence to Jarosław Paszek.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Paszek, J., Górecki, P. (2017). New Algorithms for the Genomic Duplication Problem. In: Meidanis, J., Nakhleh, L. (eds) Comparative Genomics. RECOMB-CG 2017. Lecture Notes in Computer Science, vol 10562. Springer, Cham. https://doi.org/10.1007/978-3-319-67979-2_6

  • DOI: https://doi.org/10.1007/978-3-319-67979-2_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67978-5

  • Online ISBN: 978-3-319-67979-2

  • eBook Packages: Computer Science, Computer Science (R0)
