The investigation of genetic differences among humans has provided evidence that mutations in DNA sequences are responsible for some genetic diseases. The most common mutation involves only a single nucleotide of the DNA sequence and is called a single nucleotide polymorphism (SNP). As a consequence, computing a complete map of all SNPs occurring in human populations is one of the primary goals of recent studies in human genomics. The construction of such a map requires determining the DNA sequences that form all chromosomes. In diploid organisms like humans, each chromosome consists of two sequences called haplotypes. Distinguishing the information contained in both haplotypes when analyzing chromosome sequences poses several new computational issues, which collectively form an emerging topic of Computational Biology known as Haplotyping.
This paper is a comprehensive study of some new combinatorial approaches proposed in this research area; it focuses mainly on the formulations and algorithmic solutions of some basic biological problems. Three statistical approaches are briefly discussed at the end of the paper.
Additional information
Jing Li is supported by the NSF of USA under Grant No.CGR-9988353.
Paola Bonizzoni is an associate professor of computer science at the Università di Milano-Bicocca in Milan. She received her M.S. degree in computer science from the Università di Milano in 1988 and her Ph.D. degree in computer science from the Università di Milano-Torino in 1993. She was an assistant professor in computer science at the Università di Milano from 1993 to 1999. She has been a visiting research associate at the University of Colorado at Boulder (USA). Her research interests are mainly in the area of theoretical computer science, including computational complexity, models in biomolecular computation, graph theory, algorithms on strings, trees and graphs, and computational biology. Recently, she has been involved in research topics concerning sequence comparison, tree reconstruction and gene finding algorithms. On these subjects she has published several papers in international journals, contributed volumes and conference proceedings.
Gianluca Della Vedova was appointed assistant professor at the Department of Statistics, Università di Milano-Bicocca, in 2001. He holds the Ph.D. and the M.Sc. degrees in computer science (Università di Milano). His research interests focus on the design of combinatorial algorithms in bioinformatics and graph theory. He has published several papers on bioinformatics and theoretical computer science in international journals.
Riccardo Dondi is a Ph.D. candidate in computer science at the Università di Milano-Bicocca. In 1999 he received the M.Sc. degree in computer science from the Università di Milano. His research interests are in the algorithm design and computational complexity of some combinatorial problems in computational biology, in particular the reconstruction and comparison of evolutionary trees.
Jing Li is a Ph.D. candidate in the Department of Computer Science and Engineering at the University of California, Riverside. He received the B.S. degree in statistics from Peking University, P.R. China, in July 1995 and the M.S. degree in statistical genetics from Creighton University in Aug. 2000. His recent research interests include bioinformatics and computational molecular biology, algorithms and statistical genetics. He has published several papers on bioinformatics and statistical genetics in conferences and journals.
Cite this article
Bonizzoni, P., Della Vedova, G., Dondi, R. et al. The Haplotyping problem: An overview of computational models and solutions. J. Comput. Sci. & Technol. 18, 675–688 (2003). https://doi.org/10.1007/BF02945456