Algorithms to Distinguish the Role of Gene-Conversion from Single-Crossover Recombination in the Derivation of SNP Sequences in Populations

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3909)

Abstract

Meiotic recombination is a fundamental biological event and one of the principal evolutionary forces responsible for shaping genetic variation within species. In addition to its fundamental role, recombination is central to several critical applied problems. The most important example is “association mapping” in populations, which is widely hoped to help find genes that influence genetic diseases [3, 4]. Hence, a great deal of recent attention has focused on problems of inferring the historical derivation of sequences in populations when both mutations and recombinations have occurred. In the algorithms literature, most of that recent work has been directed to single-crossover recombination. However, gene-conversion is an important, and more common, form of (two-crossover) recombination which has been much less investigated in the algorithms literature.
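The structural difference between the two event types can be illustrated with a minimal sketch, assuming haplotypes are represented as equal-length binary strings over SNP sites. The function names and the half-open tract indexing are illustrative choices, not taken from the paper. A single-crossover joins a prefix of one parent to a suffix of the other, while a gene-conversion copies only an interior tract from the second parent, which is equivalent to two crossovers.

```python
def single_crossover(p: str, q: str, k: int) -> str:
    """Recombinant with the first k sites from p and the remaining sites from q."""
    assert len(p) == len(q) and 0 <= k <= len(p)
    return p[:k] + q[k:]


def gene_conversion(p: str, q: str, i: int, j: int) -> str:
    """Copy of p whose sites i..j-1 (the conversion tract) are taken from q."""
    assert len(p) == len(q) and 0 <= i <= j <= len(p)
    return p[:i] + q[i:j] + p[j:]


p = "00000000"
q = "11111111"

print(single_crossover(p, q, 3))    # 00011111
print(gene_conversion(p, q, 3, 5))  # 00011000

# A gene-conversion is a two-crossover event: cross from p into q at site i,
# then cross back into p at site j.
two_step = single_crossover(single_crossover(p, q, 3), p, 5)
print(two_step == gene_conversion(p, q, 3, 5))  # True
```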

In this paper we explicitly incorporate gene-conversion into discrete methods to study historical recombination. We are concerned with algorithms for identifying and locating the extent of historical crossing-over and gene-conversion (along with single-nucleotide mutation), and problems of constructing full putative histories of those events. The novel technical issues concern the incorporation of gene-conversion into recently developed discrete methods [20, 26] that compute lower- and upper-bound information on the amount of needed recombination without gene-conversion. We first examine the most natural extension of the lower bound methods from [20], showing that the extension can be computed efficiently, but that this extension can only yield weak lower bounds. We then develop additional ideas that lead to higher lower bounds, and show how to solve, via integer-linear programming, a more biologically realistic version of the lower bound problem. We also show how to compute effective upper bounds on the number of needed single-crossovers and gene-conversions, along with explicit networks showing a putative history of mutations, single-crossovers and gene-conversions.
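For orientation, the kind of lower bound referred to above can be sketched for the single-crossover case only. The sketch below computes the haplotype bound associated with [20]: on a chosen set of polymorphic sites, the number of distinct haplotypes minus the number of sites minus one. It assumes binary SNP rows and at most one mutation per site; it is not the gene-conversion extension developed in the paper, and the function names are hypothetical.

```python
from itertools import combinations


def haplotype_bound(rows, cols=None):
    """Distinct haplotypes - polymorphic sites - 1 on the chosen columns (floored at 0)."""
    if cols is None:
        cols = range(len(rows[0]))
    # Only polymorphic columns count as sites; monomorphic ones carry no information.
    cols = [c for c in cols if len({r[c] for r in rows}) > 1]
    distinct = {tuple(r[c] for c in cols) for r in rows}
    return max(0, len(distinct) - len(cols) - 1)


def best_subset_bound(rows):
    """Largest haplotype bound over all non-empty site subsets.

    Exponential in the number of sites, so suitable for tiny inputs only.
    """
    m = len(rows[0])
    return max(
        haplotype_bound(rows, subset)
        for size in range(1, m + 1)
        for subset in combinations(range(m), size)
    )


# Site 2 duplicates site 0, so it adds a site without adding a haplotype.
rows = ["000", "010", "101", "111"]
print(haplotype_bound(rows))    # 0
print(best_subset_bound(rows))  # 1
```

The example shows why such bounds are usually evaluated on subsets of sites: a redundant site lowers the bound on the full data, while sites 0 and 1 alone exhibit all four gametes and force at least one recombination.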

We validate the significance of these methods by showing that they can be effectively used to distinguish simulation-derived sequences generated without gene-conversion from sequences that were generated with gene-conversion. We apply the methods to recently studied sequences of Arabidopsis thaliana, identifying many more regions in the sequences than were previously identified [22], where gene-conversion may have played a significant role. Demonstration software is available at www.cs.ucdavis.edu/~gusfield.

References

  1. Bafna, V., Bansal, V.: The number of recombination events in a sample history: conflict graph and lower bounds. IEEE/ACM Transactions on Computational Biology and Bioinformatics 1, 78–90 (2004)

  2. Bafna, V., Bansal, V.: Improved recombination lower bounds for haplotype data. In: Miyano, S., Mesirov, J., Kasif, S., Istrail, S., Pevzner, P.A., Waterman, M. (eds.) RECOMB 2005. LNCS (LNBI), vol. 3500, pp. 569–584. Springer, Heidelberg (2005)

  3. Carlson, C., Eberle, M., Kruglyak, L., Nickerson, D.: Mapping complex disease loci in whole-genome association studies. Nature 429, 446–452 (2004)

  4. Clark, A.G.: Finding genes underlying risk of complex disease by linkage disequilibrium mapping. Curr. Opin. Genet. Dev. 13, 296–302 (2003)

  5. Drouin, G., Prat, F., Ell, M., Clarke, G.D.: Detecting and characterizing gene conversion between multigene family members. Mol. Biol. Evol. 16, 1369–1390 (1999)

  6. El-Mabrouk, N.: Deriving haplotypes through recombination and gene conversion pathways. J. Bioinformatics and Computational Biology 2(2), 241–256 (2004)

  7. Fearnhead, P., Harding, R.M., Schneider, J.A., Myers, S., Donnelly, P.: Application of coalescent methods to reveal fine scale rate variation and recombination hotspots. Genetics 167, 2067–2081 (2004)

  8. Frisse, L., Hudson, R.R., Bartoszewicz, A., Wall, J.D., Donfack, J., Di Rienzo, A.: Gene conversion and different population histories may explain the contrast between polymorphism and linkage disequilibrium levels. Am. J. Hum. Genet. 69, 831–843 (2001)

  9. Gusfield, D.: Optimal, efficient reconstruction of Root-Unknown phylogenetic networks with constrained and structured recombination. JCSS 70, 381–398 (2005)

  10. Gusfield, D., Eddhu, S., Langley, C.: The fine structure of galls in phylogenetic networks. INFORMS J. on Computing, special issue on Computational Biology 16, 459–469 (2004)

  11. Gusfield, D., Eddhu, S., Langley, C.: Optimal, efficient reconstruction of phylogenetic networks with constrained recombination. J. Bioinformatics and Computational Biology 2(1), 173–213 (2004)

  12. Gusfield, D., Hickerson, D., Eddhu, S.: An efficiently-computed lower bound on the number of recombinations in phylogenetic networks: Theory and empirical study. Discrete Applied Math, special issue on Computational Biology (to appear)

  13. Hein, J.: Reconstructing evolution of sequences subject to recombination using parsimony. Math. Biosci. 98, 185–200 (1990)

  14. Hein, J.: A heuristic method to reconstruct the history of sequences subject to recombination. J. Mol. Evol. 36, 396–405 (1993)

  15. Hein, J., Schierup, M., Wiuf, C.: Gene Genealogies, Variation and Evolution: A primer in coalescent theory. Oxford University Press, UK (2004)

  16. Hudson, R.: Generating samples under the Wright-Fisher neutral model of genetic variation. Bioinformatics 18(2), 337–338 (2002)

  17. Hudson, R., Kaplan, N.: Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111, 147–164 (1985)

  18. Jeffreys, A.J., May, C.A.: Intense and highly localized gene conversion activity in human meiotic crossover hot spots. Nature Genetics 36, 151–156 (2004)

  19. Lajoie, M., El-Mabrouk, N.: Recovering haplotype structure through recombination and gene conversion. Bioinformatics 21(Suppl. 2), ii173–ii179 (2005)

  20. Myers, S.R., Griffiths, R.C.: Bounds on the minimum number of recombination events in a sample history. Genetics 163, 375–394 (2003)

  21. Padhukasahasram, B., Marjoram, P., Nordborg, M.: Estimating the rate of gene conversion on human chromosome 21. Am. J. Hum. Genet. 75, 386–397 (2004)

  22. Plagnol, V., Padhukasahasram, B., Wall, J.D., Marjoram, P., Nordborg, M.: Relative influences of crossing-over and gene conversion on the pattern of linkage disequilibrium in Arabidopsis thaliana. Genetics (in press), Ahead of Print: 10.1534/genetics.104.040311

  23. Sawyer, S.: Statistical tests for detecting gene conversion. Mol. Biol. Evol. 6, 526–538 (1989)

  24. Song, Y.S., Hein, J.: Parsimonious reconstruction of sequence evolution and haplotype blocks: Finding the minimum number of recombination events. In: Proc. of 2003 Workshop on Algorithms in Bioinformatics, pp. 287–302 (2003)

  25. Song, Y.S., Hein, J.: On the minimum number of recombination events in the evolutionary history of DNA sequences. J. Math. Biol. 48, 160–186 (2004)

  26. Song, Y.S., Wu, Y., Gusfield, D.: Efficient computation of close lower and upper bounds on the minimum number of needed recombinations in the evolution of biological sequences. In: Proc. of ISMB 2005, Bioinformatics, vol. 21, pp. i413–i422 (2005)

  27. Stephens, J.C.: Statistical methods of DNA sequence analysis: Detection of intragenic recombination or gene conversion. Mol. Biol. Evol. 2, 539–556 (1985)

  28. The International HapMap Consortium: A haplotype map of the human genome. Nature 437, 1299–1320 (2005)

  29. Wall, J.D.: Close look at gene conversion hot spots. Nat. Genet. 36, 114–115 (2004)

  30. Wiehe, T., Mountain, J., Parham, P., Slatkin, M.: Distinguishing recombination and intragenic gene conversion by linkage disequilibrium patterns. Genet. Res. Camb. 75, 61–73 (2000)

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Song, Y.S., Ding, Z., Gusfield, D., Langley, C.H., Wu, Y. (2006). Algorithms to Distinguish the Role of Gene-Conversion from Single-Crossover Recombination in the Derivation of SNP Sequences in Populations. In: Apostolico, A., Guerra, C., Istrail, S., Pevzner, P.A., Waterman, M. (eds) Research in Computational Molecular Biology. RECOMB 2006. Lecture Notes in Computer Science, vol 3909. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11732990_20

  • DOI: https://doi.org/10.1007/11732990_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33295-4

  • Online ISBN: 978-3-540-33296-1

  • eBook Packages: Computer Science, Computer Science (R0)
