Abstract
Prokaryotes are a rich source of versatile molecular functional systems that typically consist of multiple, interacting proteins. The study of such systems leads to fundamental biological discoveries, such as insight into the origins of innate and adaptive immunity in animals, and also enables the development of various biotechnology applications. The discovery of functional systems by microbial genome mining is facilitated by the fact that functionally coupled genes in bacterial and archaeal genomes often cluster in operons that are conserved across long evolutionary spans. However, accurately differentiating operons from spurious gene clusters by genome comparison is a non-trivial task that depends on an underlying model of neutral genome evolution. Here, we investigate predictions of gene clustering based on a recently developed stochastic model of genome rearrangement arising from horizontal gene transfer between evolving species along a phylogenetic tree. We focus on synteny blocks, that is, strings of genes conserved across genomes, and derive analytic expressions for the expected number of synteny blocks of a given size (or of maximal size) in terms of the temporal separation between the genomes and the rates of evolutionary events. Our setting is similar to the heavily studied stick-breaking problem family, but its discrete structure and the stochastic nature of the underlying process suggest a simple, independent model. We demonstrate the predictive power of this model both in simulations and on real data from the ATGC database.
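To make the notion of a synteny block concrete, the following minimal Python sketch extracts maximal synteny blocks from two gene orders and tabulates their sizes. It is an illustration only, not the model or the analytic expressions of the paper: it assumes linear genomes, unique gene labels, and conservation in the same orientation, and the function names `maximal_synteny_blocks` and `block_size_counts` are hypothetical.

```python
from collections import Counter


def maximal_synteny_blocks(genome_a, genome_b):
    """Return maximal runs of genes that are consecutive, in the same order,
    in both genomes (assumption: each gene label occurs at most once)."""
    # Position of every gene in genome_b, for constant-time adjacency checks.
    pos_b = {gene: i for i, gene in enumerate(genome_b)}
    blocks = []
    current = []
    for gene in genome_a:
        if gene not in pos_b:
            # A gene missing from genome_b interrupts any running block.
            if current:
                blocks.append(current)
            current = []
            continue
        if current and pos_b[gene] == pos_b[current[-1]] + 1:
            # The gene directly follows the previous one in genome_b as well.
            current.append(gene)
        else:
            if current:
                blocks.append(current)
            current = [gene]
    if current:
        blocks.append(current)
    return blocks


def block_size_counts(genome_a, genome_b):
    """Count how many maximal synteny blocks there are of each size."""
    return Counter(len(block) for block in maximal_synteny_blocks(genome_a, genome_b))


# Toy example: the segment (4, 5) has moved behind (6, 7) in the second genome.
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [1, 2, 3, 6, 7, 4, 5, 8]
blocks = maximal_synteny_blocks(a, b)
sizes = block_size_counts(a, b)
print(blocks)  # [[1, 2, 3], [4, 5], [6, 7], [8]]
print(sizes)
```

The resulting size histogram is the kind of empirical quantity that expected block counts can be compared against; handling circular chromosomes or reversed blocks would require extending the adjacency test.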
Acknowledgments
We wish to thank the reviewers and in particular Reviewer 1 for his very meticulous examination and enlightening comments.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Snir, S., Wolf, Y., Brezner, S., Koonin, E., Steel, M. (2024). On the Distribution of Synteny Blocks Under a Neutral Model of Genome Dynamics. In: Scornavacca, C., Hernández-Rosales, M. (eds) Comparative Genomics. RECOMB-CG 2024. Lecture Notes in Computer Science, vol 14616. Springer, Cham. https://doi.org/10.1007/978-3-031-58072-7_9
DOI: https://doi.org/10.1007/978-3-031-58072-7_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-58071-0
Online ISBN: 978-3-031-58072-7
eBook Packages: Computer Science (R0)