Efficient and Accurate Haplotype Inference by Combining Parsimony and Pedigree Information

  • Conference paper
Algebraic and Numeric Biology

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6479)

Abstract

Existing genotyping technologies have enabled researchers to genotype hundreds of thousands of SNPs efficiently and inexpensively. Methods for the imputation of non-genotyped SNPs and the inference of haplotype information from genotypes, however, remain important, since they have the potential to increase the power of statistical association tests. In many cases, studies are conducted on sets of individuals for which pedigree information is relevant and can be used to increase the power of tests and to decrease the impact of population structure on the results obtained. This paper proposes a new Boolean optimization model for haplotype inference that combines two combinatorial approaches: the Minimum Recombinant Haplotype Configuration (MRHC), which minimizes the number of recombination events within a pedigree, and Haplotype Inference by Pure Parsimony (HIPP), which aims at finding a solution with a minimum number of distinct haplotypes within a population. The paper also describes the use of well-known techniques that yield significant performance gains, including symmetry breaking, identification of lower bounds, and the use of an appropriate constraint solver. Experimental results show that the new PedRPoly model is competitive in terms of both accuracy and efficiency.
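To make the parsimony criterion concrete, the following is a minimal brute-force sketch of HIPP on a toy instance. It is not the paper's Boolean optimization model and it ignores pedigree information entirely. The genotype encoding ('0' and '1' for homozygous sites, '2' for heterozygous sites), the function names, and the sample data are all assumptions made for illustration.

```python
from itertools import product


def explaining_pairs(genotype):
    """Return every unordered haplotype pair that explains one genotype.

    Assumed encoding: '0' and '1' mark homozygous sites, '2' marks a
    heterozygous site, where the two haplotypes must differ.
    """
    het_sites = [i for i, site in enumerate(genotype) if site == "2"]
    if not het_sites:
        return [(genotype, genotype)]
    pairs = []
    # Fix the first heterozygous site to '0' in h1 so that mirrored pairs
    # (h1, h2) and (h2, h1) are not enumerated twice.
    for bits in product("01", repeat=len(het_sites) - 1):
        h1, h2 = list(genotype), list(genotype)
        for site, bit in zip(het_sites, ("0",) + bits):
            h1[site] = bit
            h2[site] = "1" if bit == "0" else "0"
        pairs.append(("".join(h1), "".join(h2)))
    return pairs


def pure_parsimony(genotypes):
    """Exhaustively find an explanation with the fewest distinct haplotypes."""
    best_count, best_choice = None, None
    for choice in product(*(explaining_pairs(g) for g in genotypes)):
        distinct = {h for pair in choice for h in pair}
        if best_count is None or len(distinct) < best_count:
            best_count, best_choice = len(distinct), choice
    return best_count, best_choice


# Hypothetical toy data: three genotypes over three SNP sites.
genotypes = ["022", "001", "221"]
count, choice = pure_parsimony(genotypes)
print(count, choice)
```

The search is exponential in the number of heterozygous sites, so it is only usable on tiny inputs. That cost is why exact approaches encode the problem for a constraint solver instead of enumerating candidates.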

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Graça, A., Lynce, I., Marques-Silva, J., Oliveira, A.L. (2012). Efficient and Accurate Haplotype Inference by Combining Parsimony and Pedigree Information. In: Horimoto, K., Nakatsui, M., Popov, N. (eds) Algebraic and Numeric Biology. Lecture Notes in Computer Science, vol 6479. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28067-2_3

  • DOI: https://doi.org/10.1007/978-3-642-28067-2_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28066-5

  • Online ISBN: 978-3-642-28067-2

  • eBook Packages: Computer Science, Computer Science (R0)
