The Haplotyping problem: An overview of computational models and solutions

Journal of Computer Science and Technology

Abstract

The investigation of genetic differences among humans has given evidence that mutations in DNA sequences are responsible for some genetic diseases. The most common mutation is the one that involves only a single nucleotide of the DNA sequence, which is called a single nucleotide polymorphism (SNP). As a consequence, computing a complete map of all SNPs occurring in human populations is one of the primary goals of recent studies in human genomics. The construction of such a map requires determining the DNA sequences that form all chromosomes. In diploid organisms like humans, each chromosome consists of two sequences called haplotypes. Distinguishing the information contained in both haplotypes when analyzing chromosome sequences poses several new computational issues, which collectively form an emerging topic of Computational Biology known as Haplotyping.

This paper is a comprehensive study of some new combinatorial approaches proposed in this research area, and it mainly focuses on the formulations and algorithmic solutions of some basic biological problems. Three statistical approaches are briefly discussed at the end of the paper.

Author information

Corresponding author

Correspondence to Paola Bonizzoni.

Additional information

Jing Li is supported by the NSF of the USA under Grant No. CGR-9988353.

Paola Bonizzoni is an associate professor of computer science at the Università di Milano-Bicocca in Milan. She received her M.S. degree in computer science from the Università di Milano in 1988 and her Ph.D. degree in computer science from the Università di Milano-Torino in 1993. She was an assistant professor in computer science at the Università di Milano from 1993 to 1999. She has been a visiting research associate at the University of Colorado at Boulder (USA). Her research interests are mainly in the area of theoretical computer science, including computational complexity, models in biomolecular computation, graph theory, algorithms on strings, trees and graphs, and computational biology. Recently, she has been involved in research topics concerning sequence comparison, tree reconstruction and gene finding algorithms. On these subjects she has published several papers in international journals, contributed volumes and conference proceedings.

Gianluca Della Vedova was appointed assistant professor at the Department of Statistics, Università di Milano-Bicocca, in 2001. He holds Ph.D. and M.Sc. degrees in computer science from the Università di Milano. His research interests focus on the design of combinatorial algorithms in bioinformatics and graph theory. He has published several papers on bioinformatics and theoretical computer science in international journals.

Riccardo Dondi is a Ph.D. candidate in computer science at the Università di Milano-Bicocca. In 1999 he received the M.Sc. degree in computer science from the Università di Milano. His research interests are in the algorithm design and computational complexity of some combinatorial problems in computational biology, in particular the reconstruction and comparison of evolutionary trees.

Jing Li is a Ph.D. candidate in the Department of Computer Science and Engineering at the University of California, Riverside. He received the B.S. degree in statistics from Peking University, P.R. China, in July 1995 and the M.S. degree in statistical genetics from Creighton University in Aug. 2000. His recent research interests include bioinformatics and computational molecular biology, algorithms, and statistical genetics. He has published several papers on bioinformatics and statistical genetics in conferences and journals.

Cite this article

Bonizzoni, P., Della Vedova, G., Dondi, R. et al. The Haplotyping problem: An overview of computational models and solutions. J. Comput. Sci. & Technol. 18, 675–688 (2003). https://doi.org/10.1007/BF02945456

  • DOI: https://doi.org/10.1007/BF02945456
