Abstract
Single nucleotide polymorphisms (SNPs) play a fundamental role in applications including medical diagnostics, phylogenetics, and drug design. They provide the highest-resolution genetic fingerprint for identifying disease associations and human traits. Genetic variants that lie near each other tend to be inherited together; these regions of linked variants are known as haplotypes. Recent genetic research has revealed that SNPs within certain haplotype blocks induce only a few distinct common haplotypes in the majority of the population. The existence of haplotype block structure has serious implications for association-based methods for mapping disease genes. This paper proposes a parallel haplotype block partitioning and SNP selection method under a diversity function, using the Hadoop MapReduce framework. Experiments show that the proposed MapReduce-parallelized combinatorial algorithm performs well on real-world data obtained from the HapMap data set; computational efficiency improves significantly, in proportion to the number of processors used.
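The abstract does not say which diversity function or partitioning rule the method uses, so the following is only a minimal sketch under stated assumptions. It assumes a common-haplotype coverage measure (the fraction of haplotypes whose pattern within a block occurs at least twice) and a threshold of 0.8, and it uses Python's built-in `map` and `functools.reduce` as stand-ins for Hadoop map and reduce tasks. All function names are hypothetical and are not taken from the paper.

```python
from collections import Counter
from functools import reduce


def common_fraction(haplotypes, start, end):
    """Assumed diversity measure: share of haplotypes whose pattern over
    SNPs start..end (inclusive) occurs at least twice in the sample."""
    counts = Counter(h[start:end + 1] for h in haplotypes)
    common = sum(c for c in counts.values() if c >= 2)
    return common / len(haplotypes)


def map_start(haplotypes, start, threshold):
    """Map step: for one start SNP, emit every (start, end) interval that
    qualifies as a block. Each start position is independent of the others,
    which is what makes this step easy to distribute."""
    n_snps = len(haplotypes[0])
    return [(start, end) for end in range(start, n_snps)
            if common_fraction(haplotypes, start, end) >= threshold]


def collect(feasible, pairs):
    """Reduce step: merge the intervals emitted by the mappers into one set."""
    feasible.update(pairs)
    return feasible


def partition(haplotypes, threshold=0.8):
    """Cover all SNPs with the fewest qualifying blocks (dynamic programming).
    Returns a list of (start, end) blocks, or None if no cover exists."""
    n = len(haplotypes[0])
    mapped = map(lambda s: map_start(haplotypes, s, threshold), range(n))
    feasible = reduce(collect, mapped, set())

    inf = float("inf")
    best = [0] + [inf] * n      # best[k]: fewest blocks covering SNPs 0..k-1
    prev = [None] * (n + 1)     # prev[k]: start of the last block in that cover
    for k in range(1, n + 1):
        for s in range(k):
            if (s, k - 1) in feasible and best[s] + 1 < best[k]:
                best[k] = best[s] + 1
                prev[k] = s
    if best[n] == inf:
        return None

    blocks = []
    k = n
    while k > 0:                # walk back through the chosen block starts
        blocks.append((prev[k], k - 1))
        k = prev[k]
    return blocks[::-1]


# Toy input: six haplotypes over four biallelic SNPs (illustrative data only).
example = ["0011", "0011", "0010", "1110", "1101", "1101"]
print(partition(example))
```

In an actual Hadoop job, each call to `map_start` would run as a separate map task keyed by start position, and the merge in `collect` would happen in the reducer; the sketch keeps everything in one process so the data flow is easy to follow.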
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hung, CL., Lin, YL., Hua, GJ., Hu, YC. (2011). CloudTSS: A TagSNP Selection Approach on Cloud Computing. In: Kim, Th., et al. Grid and Distributed Computing. GDC 2011. Communications in Computer and Information Science, vol 261. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27180-9_64
DOI: https://doi.org/10.1007/978-3-642-27180-9_64
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27179-3
Online ISBN: 978-3-642-27180-9
eBook Packages: Computer Science (R0)