Abstract
The single nucleotide polymorphism (SNP), the most common form of genetic variation, has been widely studied to analyze possible associations between diseases and genomes. To gain more information, the SNPs on a single chromosome, which together constitute a haplotype, are usually studied jointly. Obtaining haplotypes through biological experiments is usually costly and time-consuming, which has motivated the development of efficient computational methods for determining them. Many haplotype problems and algorithms have been proposed to reduce the cost of disease association studies. In general, four categories of problems are widely researched: the haplotype assembly problem, the haplotype inference problem, the haplotype block partition problem, and the haplotype tagging SNP selection problem. The first two problems have been well reviewed by many researchers, whereas, to our knowledge, the last two have not been comprehensively surveyed. In this paper, we give a detailed introduction to all four problems, with emphasis on the last two.
Cite this article
Zhao, Y., Xu, Y., Zhang, Q. et al. An overview of the haplotype problems and algorithms. Front. Comput. Sc. China 1, 272–282 (2007). https://doi.org/10.1007/s11704-007-0027-y
DOI: https://doi.org/10.1007/s11704-007-0027-y