Abstract
Global patterns of human DNA sequence variation (haplotypes) defined by common single nucleotide polymorphisms (SNPs) have important implications for identifying disease associations and human traits. Recent genetics research reveals that SNPs within certain haplotype blocks induce only a few distinct common haplotypes in the majority of the population. The existence of haplotype block structure has serious implications for association-based methods for the mapping of disease genes. Our ultimate goal is to select haplotype block designations that best capture the structure within the data.
In this paper we propose several efficient combinatorial algorithms for selecting interesting haplotype blocks under different diversity functions, generalizing many previous results in the literature. In particular, given an m×n haplotype matrix A, we give linear-time algorithms for finding all interval diversities, the farthest sites, and the longest block within A. For selecting multiple long blocks under a diversity constraint, we show that k blocks with the longest total length can be found in O(nk) time. We also propose linear-time algorithms for computing all intra-longest-blocks and all intra-k-longest-blocks.
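To make the block-selection problems concrete, the following is a minimal sketch and not the paper's algorithms. It assumes one particular diversity function (the number of distinct haplotypes in a column interval), assumes the selected blocks must be disjoint, and uses hypothetical function names. The window scan is a naive baseline that recomputes diversity from scratch, so it does not reach the linear-time bound stated above; only the final dynamic program runs in O(nk) time once the leftmost feasible start of every column is known.

```python
from typing import List, Sequence, Tuple


def diversity(A: Sequence[str], i: int, j: int) -> int:
    """Assumed diversity function: distinct haplotypes in columns i..j (0-based, inclusive)."""
    return len({row[i:j + 1] for row in A})


def leftmost_starts(A: Sequence[str], max_div: int) -> List[int]:
    """For each end column j, the smallest start i with diversity(A, i, j) <= max_div.

    Widening a block never lowers this diversity, so the start only moves right
    as j grows (two-pointer scan). The value j + 1 means column j alone is infeasible.
    """
    n = len(A[0]) if A else 0
    starts = []
    i = 0
    for j in range(n):
        while i <= j and diversity(A, i, j) > max_div:
            i += 1
        starts.append(i)
    return starts


def longest_block(A: Sequence[str], max_div: int) -> Tuple[int, int, int]:
    """Return (length, start, end) of the first longest feasible block; (0, 0, -1) if none."""
    best = (0, 0, -1)
    for j, i in enumerate(leftmost_starts(A, max_div)):
        if j - i + 1 > best[0]:
            best = (j - i + 1, i, j)
    return best


def best_k_blocks(A: Sequence[str], max_div: int, k: int) -> int:
    """Maximum total length of at most k disjoint feasible blocks, via an O(nk) DP.

    cur[c] is the best total over the first c columns. A block ending at column j
    may always start at its leftmost feasible start, because shrinking a feasible
    block keeps it feasible and the DP value grows by at most 1 per column.
    """
    starts = leftmost_starts(A, max_div)
    n = len(starts)
    prev = [0] * (n + 1)
    for _ in range(k):
        cur = [0] * (n + 1)
        for j in range(n):
            i = starts[j]
            cur[j + 1] = max(cur[j], prev[i] + (j - i + 1))
        prev = cur
    return prev[n]


if __name__ == "__main__":
    # Toy 4x6 haplotype matrix; rows are haplotypes, columns are SNP sites.
    A = ["000110", "000111", "010110", "010001"]
    print(longest_block(A, 2))     # (3, 0, 2)
    print(best_k_blocks(A, 2, 2))  # 5
```

On the toy matrix, columns 0 to 2 form the first longest block with at most two distinct haplotypes, and two disjoint blocks can cover five of the six columns.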
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lin, YL. (2008). Efficient Algorithms for SNP Haplotype Block Selection Problems. In: Hu, X., Wang, J. (eds) Computing and Combinatorics. COCOON 2008. Lecture Notes in Computer Science, vol 5092. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69733-6_31
DOI: https://doi.org/10.1007/978-3-540-69733-6_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69732-9
Online ISBN: 978-3-540-69733-6
eBook Packages: Computer Science, Computer Science (R0)