Abstract
The recently developed Domain-Gene-Species (DGS) reconciliation framework, which jointly models the evolution of a domain family inside one or more gene families and the evolution of those gene families inside a species tree, represents one of the most powerful computational techniques for reconstructing detailed histories of domain and gene family evolution in eukaryotic species. However, the DGS reconciliation framework can reconcile only a single domain tree (representing a single domain family present in one or more gene families from the species under consideration) at a time; that is, each domain tree is reconciled separately, without regard to any other domain families that might be present in the gene trees under consideration. Reconciling each domain tree separately can lead to conflicting gene-species reconciliations for gene trees containing multiple domain families.
In this work, we address this problem by extending the DGS reconciliation model to simultaneously reconcile a set of domain trees, a set of gene trees, and a species tree. The new model, which we call the multi-DGS (mDGS) reconciliation model, produces a consistent joint reconciliation showing the evolution of each domain tree in its corresponding gene trees and the evolution of each gene tree inside the species tree. We formalize the mDGS reconciliation framework and define the associated computational problem, provide a heuristic algorithm for estimating optimal mDGS reconciliations (both the DGS and mDGS reconciliation problems are NP-hard), and apply our algorithm to a large dataset of over 3800 domain trees and over 7100 gene trees from 12 fly species. Our analysis of this dataset reveals interesting underlying patterns of co-occurrence of domains and genes, demonstrates the importance of mDGS reconciliation, and shows that the proposed heuristic is effective at estimating optimal mDGS reconciliations.
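The conflict that motivates mDGS reconciliation can be illustrated with a minimal sketch. It assumes, purely for illustration, that each separate DGS run has been summarized as a mapping from gene-tree nodes to species-tree nodes; the function name `find_conflicts`, the input layout, and the toy labels are all hypothetical and are not taken from the paper or its software.

```python
from collections import defaultdict


def find_conflicts(per_domain_mappings):
    """Report gene-tree nodes placed on different species-tree nodes.

    per_domain_mappings is a hypothetical summary of separate DGS runs:
    {domain_family: {gene_node: species_node}}.
    Returns {gene_node: {species_node: [domain_family, ...]}} for every
    gene node on which the separate runs disagree.
    """
    implied = defaultdict(lambda: defaultdict(list))
    for domain, gene_to_species in per_domain_mappings.items():
        for gene_node, species_node in gene_to_species.items():
            implied[gene_node][species_node].append(domain)
    # Keep only gene nodes that received more than one distinct placement.
    return {
        gene_node: {s: sorted(d) for s, d in by_species.items()}
        for gene_node, by_species in implied.items()
        if len(by_species) > 1
    }


# Toy input (assumed labels): two domain families share one gene tree.
separate_runs = {
    "domA": {"g1": "s3", "g2": "s1"},
    "domB": {"g1": "s4", "g2": "s1"},
}
conflicts = find_conflicts(separate_runs)
```

In this toy input the two runs agree on `g2` but place `g1` on different species nodes, which is the kind of inconsistency a joint reconciliation is meant to rule out. The sketch only detects such disagreements; it does not perform reconciliation of any kind.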
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, L., Bansal, M.S. (2019). Simultaneous Multi-Domain-Multi-Gene Reconciliation Under the Domain-Gene-Species Reconciliation Model. In: Cai, Z., Skums, P., Li, M. (eds) Bioinformatics Research and Applications. ISBRA 2019. Lecture Notes in Computer Science, vol 11490. Springer, Cham. https://doi.org/10.1007/978-3-030-20242-2_7
DOI: https://doi.org/10.1007/978-3-030-20242-2_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20241-5
Online ISBN: 978-3-030-20242-2
eBook Packages: Computer Science, Computer Science (R0)