Two Birds, One Stone: Selecting Functionally Informative Tag SNPs for Disease Association Studies

  • Conference paper
Algorithms in Bioinformatics (WABI 2007)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4645)

Abstract

Selecting an informative subset of SNPs, generally referred to as tag SNPs, to genotype and analyze is considered to be an essential step toward effective disease association studies. However, while the selected informative tag SNPs may characterize the allele information of a target genomic region, they are not necessarily the ones directly associated with disease or with functional impairment. To address this limitation, we present a first integrative SNP selection system that simultaneously identifies SNPs that are both informative and carry a deleterious functional effect – which in turn means that they are likely to be directly associated with disease. We formulate the problem of selecting functionally informative tag SNPs as a multi-objective optimization problem and present a heuristic algorithm for addressing it. We also present the system we developed for assessing the functional significance of SNPs. To evaluate our system, we compare it to other state-of-the-art SNP selection systems, which conduct both information-based tag SNP selection and function-based SNP selection, but do so in two separate consecutive steps. Using 14 datasets, based on disease-related genes curated by the OMIM database, we show that our system consistently improves upon current systems.

Editor information

Raffaele Giancarlo, Sridhar Hannenhalli

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lee, P.H., Shatkay, H. (2007). Two Birds, One Stone: Selecting Functionally Informative Tag SNPs for Disease Association Studies. In: Giancarlo, R., Hannenhalli, S. (eds) Algorithms in Bioinformatics. WABI 2007. Lecture Notes in Computer Science, vol 4645. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74126-8_7

  • DOI: https://doi.org/10.1007/978-3-540-74126-8_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74125-1

  • Online ISBN: 978-3-540-74126-8

  • eBook Packages: Computer Science, Computer Science (R0)
