An overview of the haplotype problems and algorithms

  • Review Article
  • Frontiers of Computer Science in China

Abstract

The single nucleotide polymorphism (SNP), the most common form of genetic variation, has been widely studied to help analyze possible associations between diseases and genomes. To gain more information, the SNPs on a single chromosome, which together constitute a haplotype, are usually studied jointly. Obtaining haplotypes through biological experiments is usually costly and time-consuming, which has motivated the development of efficient computational methods for determining them. Many haplotype problems and algorithms have been proposed to reduce the cost of disease association studies. In general, four categories of problems are widely researched: the haplotype assembly problem, the haplotype inference problem, the haplotype block partition problem, and the haplotype tagging SNP selection problem. The first two problems have been well reviewed by many researchers, whereas the last two have not, to our knowledge, been comprehensively surveyed. In this paper, we give a detailed introduction to all four problems, with emphasis on the last two.
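
To make the motivation for computational haplotype inference concrete, the following minimal Python sketch enumerates every pair of haplotypes that could explain a single unphased genotype. It is an illustration, not an algorithm taken from the article. The encoding is an assumption: '0' and '1' mark sites that are homozygous for the respective allele, and '2' marks a heterozygous site. The function name `compatible_haplotype_pairs` is hypothetical.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Return all unordered haplotype pairs that explain a genotype.

    Assumed encoding (not from the article): '0' and '1' are homozygous
    sites, '2' is a heterozygous site.
    """
    het_sites = [i for i, g in enumerate(genotype) if g == "2"]
    if not het_sites:
        # No heterozygous site: both chromosomes carry the same haplotype.
        return [(genotype, genotype)]

    pairs = []
    # Fix allele 0 at the first heterozygous site of the first haplotype,
    # so that (h1, h2) and (h2, h1) are not both reported.
    for bits in product("01", repeat=len(het_sites) - 1):
        choice = ("0",) + bits
        h1 = list(genotype)
        h2 = list(genotype)
        for site, allele in zip(het_sites, choice):
            h1[site] = allele
            h2[site] = "1" if allele == "0" else "0"
        pairs.append(("".join(h1), "".join(h2)))
    return pairs


if __name__ == "__main__":
    for pair in compatible_haplotype_pairs("0212"):
        print(pair)
```

Under this encoding, a genotype with k ≥ 1 heterozygous sites admits 2^(k-1) distinct explanations, which is why additional criteria or data are needed to select among them.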

Author information

Corresponding author

Correspondence to Zhao Yuzhong.

About this article

Cite this article

Zhao, Y., Xu, Y., Zhang, Q. et al. An overview of the haplotype problems and algorithms. Front. Comput. Sc. China 1, 272–282 (2007). https://doi.org/10.1007/s11704-007-0027-y

  • DOI: https://doi.org/10.1007/s11704-007-0027-y
