Abstract
One of the fundamental issues in evolutionary molecular biology is to discover genomic duplication events and their correspondence to the species tree. Such events can be reconstructed by clustering single gene duplications that are inferred by reconciling a set of gene trees with a species tree. Here we propose the first solution to the genomic duplication problem in which every reconciliation with the minimal number of single gene duplications is allowed and the clustering method is minimum episodes, under the assumption that the input gene trees are unrooted. We also present an evaluation study of the proposed algorithms on empirical datasets.
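As an illustration of how single gene duplications are inferred by reconciliation, the following is a minimal sketch of the classical LCA-mapping approach for a rooted gene tree. It is not the algorithm proposed in the paper, which handles unrooted gene trees and clusters duplications into episodes. The tree encoding (nested tuples with species names as leaves) and the function names `clusters`, `lca_map`, and `find_duplications` are assumptions made for this example only.

```python
# Sketch only: rooted trees as nested tuples; a leaf is a species name (str),
# an internal node is a pair (left, right). Gene tree leaves carry the name of
# the species the gene was sampled from. All names here are hypothetical.

def clusters(species_tree):
    """Return the leaf set (cluster) of every node of the species tree."""
    found = []

    def walk(node):
        if isinstance(node, str):
            cluster = frozenset([node])
        else:
            cluster = walk(node[0]) | walk(node[1])
        found.append(cluster)
        return cluster

    walk(species_tree)
    return found


def lca_map(leaf_set, species_clusters):
    """Map a set of species to the smallest species-tree cluster containing it."""
    return min((c for c in species_clusters if leaf_set <= c), key=len)


def find_duplications(gene_tree, species_tree):
    """Return the species-tree clusters to which duplication nodes are mapped.

    An internal gene node is a duplication when it maps to the same
    species node as at least one of its children.
    """
    species_clusters = clusters(species_tree)
    duplications = []

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        left, right = walk(node[0]), walk(node[1])
        mapped = lca_map(left | right, species_clusters)
        if mapped in (lca_map(left, species_clusters),
                      lca_map(right, species_clusters)):
            duplications.append(mapped)
        return left | right

    walk(gene_tree)
    return duplications


species = (("a", "b"), "c")
gene = (("a", "b"), ("a", "c"))
dups = find_duplications(gene, species)
```

In this example the root of the gene tree and its right child both map to the root of the species tree, so one duplication is inferred there. Clustering such single duplications across many gene trees is the step that the genomic duplication problem formalizes.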
Notes
1. In this article, the notion of a plateau is used exclusively with the duplication cost. In the literature, it is often called a D-plateau to distinguish it from plateaus for other costs, e.g. the DL-plateau [47].
References
Kellis, M., Birren, B.W., Lander, E.S.: Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae. Nature 428, 617–624 (2004)
Guyot, R., Keller, B.: Ancestral genome duplication in rice. Genome 47(3), 610–614 (2004)
Vision, T.J., Brown, D.G., Tanksley, S.D.: The origins of genomic duplications in Arabidopsis. Science 290(5499), 2114–2117 (2000)
Costantino, L., Sotiriou, S.K., Rantala, J.K., Magin, S., et al.: Break-induced replication repair of damaged forks induces genomic duplications in human cells. Science 343(6166), 88–91 (2014)
Cui, L., Wall, P.K., Leebens-Mack, J.H., Lindsay, B.G., et al.: Widespread genome duplications throughout the history of flowering plants. Genome Res. 16(6), 738–749 (2006)
Aury, J.M., Jaillon, O., Duret, L., Noel, B., et al.: Global trends of whole-genome duplications revealed by the ciliate Paramecium tetraurelia. Nature 444(7116), 171–178 (2006)
Van de Peer, Y., Maere, S., Meyer, A.: The evolutionary significance of ancient genome duplications. Nat. Rev. Genet. 10(10), 725–732 (2009)
Vandepoele, K., Simillion, C., Van de Peer, Y.: Evidence that rice and other cereals are ancient aneuploids. Plant Cell 15(9), 2192–2202 (2003)
Sato, S., Tabata, S., Hirakawa, H., Asamizu, E., et al.: The tomato genome sequence provides insights into fleshy fruit evolution. Nature 485(7400), 635–641 (2012)
Scossa, F., Brotman, Y., de Abreu e Lima, F., et al.: Genomics-based strategies for the use of natural variation in the improvement of crop metabolism. Plant Sci. 242, 47–64 (2016)
Vanneste, K., Maere, S., Van de Peer, Y.: Tangled up in two: a burst of genome duplications at the end of the Cretaceous and the consequences for plant evolution. Philos. Trans. R. Soc. Lond. B Biol. Sci. 369(1648), 20130353 (2014)
Tang, H., Bowers, J.E., Wang, X., Ming, R., et al.: Synteny and collinearity in plant genomes. Science 320(5875), 486–488 (2008)
Holloway, P., Swenson, K., Ardell, D., El-Mabrouk, N.: Ancestral genome organization: an alignment approach. J. Comput. Biol. 20(4), 280–295 (2013)
Blanc, G., Wolfe, K.H.: Widespread paleopolyploidy in model plant species inferred from age distributions of duplicate genes. Plant Cell 16(7), 1667–1678 (2004)
Bowers, J.E., Chapman, B.A., Rong, J., Paterson, A.H.: Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422(6930), 433–438 (2003)
Jiao, Y., Wickett, N.J., Ayyampalayam, S., Chanderbali, A.S., et al.: Ancestral polyploidy in seed plants and angiosperms. Nature 473(7345), 97–100 (2011)
Rabier, C.E., Ta, T., Ané, C.: Detecting and locating whole genome duplications on a phylogeny: a probabilistic approach. Mol. Biol. Evol. 31(3), 750–762 (2014)
Page, R.D.M.: Maps between trees and cladistic analysis of historical associations among genes, organisms, and areas. Syst. Biol. 43(1), 58–77 (1994)
Mirkin, B., Muchnik, I., Smith, T.F.: A biologically consistent model for comparing molecular phylogenies. J. Comput. Biol. 2(4), 493–507 (1995)
Guigó, R., Muchnik, I.B., Smith, T.F.: Reconstruction of ancient molecular phylogeny. Mol. Phylogenet. Evol. 6(2), 189–213 (1996)
Arvestad, L., Berglund, A.C., Lagergren, J., Sennblad, B.: Bayesian gene/species tree reconciliation and orthology analysis using MCMC. Bioinformatics 19(Suppl 1), i7–i15 (2003)
Bonizzoni, P., Della Vedova, G., Dondi, R.: Reconciling a gene tree to a species tree under the duplication cost model. Theor. Comput. Sci. 347(1–2), 36–53 (2005)
Noutahi, E., Semeria, M., Lafond, M., Seguin, J., et al.: Efficient gene tree correction guided by genome evolution. PLoS ONE 11(8), 1–22 (2016)
Schmidt-Böcking, H., Reich, K., Templeton, A., Trageser, W., Vill, V.: Reconstructing a supergenetree minimizing. BMC Bioinf. 16(14), S4 (2015)
Dondi, R., Mauri, G., Zoppis, I.: Orthology correction for gene tree reconstruction: theoretical and experimental results. Proc. Comput. Sci. 108, 1115–1124 (2017)
Scornavacca, C., Jacox, E., Szöllősi, G.J.: Joint amalgamation of most parsimonious reconciled gene trees. Bioinformatics 31(6), 841–848 (2014)
Nakhleh, L.: Computational approaches to species phylogeny inference and gene tree reconciliation. Trends Ecol. Evol. 28(12), 719–728 (2013)
Zhu, Y., Lin, Z., Nakhleh, L.: Evolution after whole-genome duplication: a network perspective. G3: Genes, Genomes, Genetics 3(11), 2049–2057 (2013)
Zheng, Y., Zhang, L.: Effect of incomplete lineage sorting on tree-reconciliation-based inference of gene duplication. IEEE/ACM Trans. Comput. Biol. Bioinf. 11(3), 477–485 (2014)
Duchemin, W., Anselmetti, Y., Patterson, M., Ponty, Y., et al.: DeCoSTAR: Reconstructing the ancestral organization of genes or genomes using reconciled phylogenies. Genome Biol. Evol. 9(5), 1312–1319 (2017)
Goodman, M., Czelusniak, J., Moore, G.W., Romero-Herrera, A.E., et al.: Fitting the gene lineage into its species lineage, a parsimony strategy illustrated by cladograms constructed from globin sequences. Syst. Zool. 28(2), 132–163 (1979)
Doyon, J.P., Chauve, C., Hamel, S.: Space of gene/species tree reconciliations and parsimonious models. J. Comput. Biol. 16(10), 1399–1418 (2009)
Ma, B., Li, M., Zhang, L.: From gene trees to species trees. SIAM J. Comput. 30(3), 729–752 (2000)
Stolzer, M., Lai, H., Xu, M., et al.: Inferring duplications, losses, transfers and incomplete lineage sorting with nonbinary species trees. Bioinformatics 28(18), i409–i415 (2012)
Górecki, P., Tiuryn, J.: DLS-trees: a model of evolutionary scenarios. Theor. Comput. Sci. 359(1–3), 378–399 (2006)
Paszek, J., Górecki, P.: Genomic duplication problems for unrooted gene trees. BMC Genom. 17(1), 165–175 (2016)
Page, R.D.M., Cotton, J.A.: Vertebrate phylogenomics: reconciled trees and gene duplications. In: Pacific Symposium on Biocomputing, pp. 536–547 (2002)
Bansal, M.S., Eulenstein, O.: The multiple gene duplication problem revisited. Bioinformatics 24(13), i132–i138 (2008)
Burleigh, J.G., Bansal, M.S., Wehe, A., Eulenstein, O.: Locating multiple gene duplications through reconciled trees. In: Vingron, M., Wong, L. (eds.) RECOMB 2008. LNCS, vol. 4955, pp. 273–284. Springer, Heidelberg (2008). doi:10.1007/978-3-540-78839-3_24
Mettanant, V., Fakcharoenphol, J.: A linear-time algorithm for the multiple gene duplication problem. NCSEC, pp. 198–203 (2008)
Luo, C.W., Chen, M.C., Chen, Y.C., Yang, R.W.L., et al.: Linear-time algorithms for the multiple gene duplication problems. IEEE/ACM Trans. Comput. Biol. Bioinform. 8(1), 260–265 (2011)
Burleigh, J.G., Bansal, M.S., Eulenstein, O., Vision, T.J.: Inferring species trees from gene duplication episodes. ACM BCB, pp. 198–203 (2010)
Paszek, J., Górecki, P.: Efficient algorithms for genomic duplication models. APBC 2017. IEEE/ACM Trans. Comput. Biol. Bioinform. doi:10.1109/TCBB.2017.2706679
Fellows, M., Hallett, M., Stege, U.: On the multiple gene duplication problem. In: Chwa, K.-Y., Ibarra, O.H. (eds.) ISAAC 1998. LNCS, vol. 1533, pp. 348–357. Springer, Heidelberg (1998). doi:10.1007/3-540-49381-6_37
Czabarka, E., Székely, L., Vision, T.: Minimizing the number of episodes and Gallai’s theorem on intervals. arXiv:1209.5699 (2012)
Górecki, P., Tiuryn, J.: Inferring phylogeny from whole genomes. Bioinformatics 23(2), e116–e122 (2007)
Górecki, P., Eulenstein, O., Tiuryn, J.: Unrooted tree reconciliation: a unified approach. IEEE/ACM Trans. Comput. Biol. Bioinform. 10(2), 522–536 (2013)
Page, R.D.M., Charleston, M.A.: Reconciled trees and incongruent gene and species trees. Math. Hierarchies Biol. DIMACS 96 37, 57–70 (1997)
Acknowledgements
We would like to thank the reviewers for their detailed comments, which allowed us to improve our paper. Support was provided by NCN grants #2015/19/N/ST6/01193 and #2015/19/B/ST6/00726.
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Paszek, J., Górecki, P. (2017). New Algorithms for the Genomic Duplication Problem. In: Meidanis, J., Nakhleh, L. (eds) Comparative Genomics. RECOMB-CG 2017. Lecture Notes in Computer Science, vol 10562. Springer, Cham. https://doi.org/10.1007/978-3-319-67979-2_6
DOI: https://doi.org/10.1007/978-3-319-67979-2_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67978-5
Online ISBN: 978-3-319-67979-2
eBook Packages: Computer Science, Computer Science (R0)