An Efficient Algorithm for Haplotype Inference on Pedigrees with a Small Number of Recombinants (Extended Abstract)

  • Conference paper
Algorithms - ESA 2009 (ESA 2009)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5757)

Abstract

Combinatorial (or rule-based) methods for inferring haplotypes from genotypes on a pedigree have been studied extensively in the recent literature. These methods generally try to reconstruct the haplotypes of each individual so that the total number of recombinants in the pedigree is minimized. The problem is NP-hard, although the number of recombinants in a practical dataset is known to be very small in most cases. In this paper, we consider how to efficiently infer haplotypes on a large pedigree when the number of recombinants is bounded by a small constant, i.e., the so-called k-recombinant haplotype configuration (k-RHC) problem. We introduce a simple probabilistic model for k-RHC in which the prior haplotype probability of a founder and the haplotype transmission probability from a parent to a child are assumed to follow the uniform distribution, and k random recombinants are assumed to have taken place uniformly and independently in the pedigree. We present an \(O(mn\log^{k+1} n)\) time algorithm for k-RHC on tree pedigrees without mating loops, where m is the number of loci and n is the size of the input pedigree, and prove that when \(90\log n < m < n^3\), the algorithm correctly finds a feasible haplotype configuration that obeys the Mendelian law of inheritance and requires no more than k recombinants with probability \(1 - O(k^2\frac{\log^2n}{mn}+\frac{1}{n^2})\). The algorithm is efficient when k is of moderate value and could thus be used in practice to infer haplotypes from genotypes on large tree pedigrees. We have implemented the algorithm as a C++ program named Tree-k-RHC. The implementation incorporates several ideas for dealing effectively with missing data and with data containing a large number of recombinants. Our experimental results on both simulated and real datasets show that Tree-k-RHC reconstructs haplotypes with high accuracy and is much faster than the best combinatorial method in the literature.
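To make the quantity being bounded concrete, the following is a minimal sketch of how recombinants might be counted for a single parent-to-child transmission once haplotypes are already known. It is not the Tree-k-RHC algorithm, which infers the haplotypes from genotypes; it only illustrates the objective. The function name `min_recombinants` and the encoding of haplotypes as equal-length lists of allele labels are assumptions made for this illustration.

```python
from math import inf


def min_recombinants(hap_a, hap_b, child_hap):
    """Fewest switches between a parent's two haplotypes that explain child_hap.

    hap_a, hap_b: the parent's two haplotypes (hypothetical encoding: one
    allele label per locus). child_hap: the haplotype the child received
    from this parent. Returns None if some locus matches neither parental
    haplotype, i.e. the transmission violates Mendelian inheritance.
    """
    if not (len(hap_a) == len(hap_b) == len(child_hap)):
        raise ValueError("all haplotypes must cover the same loci")

    # cost[s] = fewest switches so far if the current locus is copied
    # from parental haplotype s (0 for hap_a, 1 for hap_b).
    cost = [0, 0]
    for a, b, c in zip(hap_a, hap_b, child_hap):
        # Either keep copying from the same haplotype or pay one switch.
        reach = [min(cost[0], cost[1] + 1), min(cost[1], cost[0] + 1)]
        # A source is usable at this locus only if its allele matches.
        cost = [reach[0] if a == c else inf, reach[1] if b == c else inf]

    best = min(cost)
    return None if best == inf else best


# One crossover: the first two loci come from hap_a, the last two from hap_b.
print(min_recombinants([1, 1, 0, 0], [0, 0, 1, 1], [1, 1, 1, 1]))
```

Summing this count over every parent-to-child transmission in a pedigree would give the total number of recombinants of a haplotype configuration, which is the value that k-RHC requires to be at most k.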

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xiao, J., Lou, T., Jiang, T. (2009). An Efficient Algorithm for Haplotype Inference on Pedigrees with a Small Number of Recombinants (Extended Abstract). In: Fiat, A., Sanders, P. (eds) Algorithms - ESA 2009. ESA 2009. Lecture Notes in Computer Science, vol 5757. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04128-0_30

  • DOI: https://doi.org/10.1007/978-3-642-04128-0_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04127-3

  • Online ISBN: 978-3-642-04128-0

  • eBook Packages: Computer Science, Computer Science (R0)
