Selection of representative SNP sets for genome-wide association studies: a metaheuristic approach

  • Original Paper
  • Optimization Letters

Abstract

Since the completion of the Human Genome Project in 2003, it has become possible to associate genetic variations in the human genome with common and complex diseases. The current challenge is to use genomic data efficiently and to develop tools that improve our understanding of the etiology of complex diseases. Many of the algorithms needed for this task were originally developed in management science and operations research (OR). One application is to select, from the whole set of Single Nucleotide Polymorphism (SNP) biomarkers, a subset that is informative yet small enough for subsequent association studies. In this paper, we present an OR application for representative SNP selection that implements our novel Simulated Annealing (SA) based feature-selection algorithm. We hope that our work will facilitate reliable identification of SNPs involved in the etiology of complex diseases and ultimately support timely identification of genomic disease biomarkers, the development of personalized-medicine approaches, and targeted drug discovery.
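The abstract does not spell out the algorithm, so the following is only a generic sketch of how simulated annealing can drive subset selection, not the authors' method. Everything in it is an assumption made for illustration: the function names (`r_squared`, `score`, `anneal_select`), the objective (mean best squared correlation of each SNP with the selected set, minus a size penalty), the single-bit-flip neighborhood, and the geometric cooling schedule.

```python
import math
import random


def r_squared(x, y):
    """Squared Pearson correlation between two genotype columns (0/1/2 codes)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) * (a - mx) for a in x)
    syy = sum((b - my) * (b - my) for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no correlation information
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def score(mask, r2, penalty):
    """Illustrative objective: coverage of all SNPs minus a subset-size penalty."""
    chosen = [j for j, keep in enumerate(mask) if keep]
    m = len(mask)
    # Coverage of SNP i is its best r^2 with any selected SNP (0 if none selected).
    coverage = sum(max((r2[i][j] for j in chosen), default=0.0) for i in range(m)) / m
    return coverage - penalty * len(chosen) / m


def anneal_select(genotypes, penalty=0.5, t0=1.0, cooling=0.95, steps=2000, seed=0):
    """Return (sorted indices of selected SNPs, best score) for a rows-by-SNPs matrix."""
    rng = random.Random(seed)
    columns = list(zip(*genotypes))
    m = len(columns)
    r2 = [[1.0 if i == j else r_squared(columns[i], columns[j]) for j in range(m)]
          for i in range(m)]

    mask = [True] * m  # start from the full SNP set
    current = score(mask, r2, penalty)
    best_mask, best = mask[:], current
    temp = t0
    for _ in range(steps):
        j = rng.randrange(m)
        mask[j] = not mask[j]  # neighbor: add or drop one SNP
        candidate = score(mask, r2, penalty)
        delta = candidate - current
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = candidate
            if current > best:
                best_mask, best = mask[:], current
        else:
            mask[j] = not mask[j]  # reject: undo the flip
        temp = max(temp * cooling, 1e-6)  # geometric cooling with a floor
    return [j for j, keep in enumerate(best_mask) if keep], best
```

On a toy matrix made of two blocks of identical SNP columns, a sketch like this tends to keep one SNP per block, since a second SNP from the same block adds size penalty without adding coverage. Real tag-SNP selection would need a properly validated objective and tuning of `penalty`, `t0`, and `cooling`.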

Author information

Corresponding author

Correspondence to Gürkan Üstünkar.

Additional information

For the Alzheimer’s Disease Neuroimaging Initiative: Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.ucla.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.ucla.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

About this article

Cite this article

Üstünkar, G., Özöğür-Akyüz, S., Weber, G.W. et al. Selection of representative SNP sets for genome-wide association studies: a metaheuristic approach. Optim Lett 6, 1207–1218 (2012). https://doi.org/10.1007/s11590-011-0419-7

  • DOI: https://doi.org/10.1007/s11590-011-0419-7
