Combinatorial Methods for Disease Association Search and Susceptibility Prediction

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4175)

Abstract

The accessibility of high-throughput genotyping technology makes genome-wide association studies possible for common complex diseases. When dealing with common diseases, it is necessary to search for and analyze multiple independent causes resulting from interactions of multiple genes scattered over the entire genome. This becomes computationally challenging, since even interactions of pairs of gene variations require checking more than 10^12 possibilities genome-wide. This paper first explores the problem of searching for the most disease-associated and the most disease-resistant multi-gene interactions for a given population sample of diseased and non-diseased individuals. A proposed fast complimentary greedy search finds multi-SNP combinations with non-trivially high association on real data. Exploiting the developed methods for searching for associated risk and resistance factors, the paper then addresses the disease susceptibility prediction problem. We first propose a relevant optimum clustering formulation and a model-fitting algorithm that transforms clustering algorithms into susceptibility prediction algorithms. For three available real data sets (Crohn’s disease (Daly et al., 2001), autoimmune disorder (Ueda et al., 2003), and tick-borne encephalitis (Barkhash et al., 2006)), the accuracies of the prediction based on the combinatorial search (respectively, 84%, 83%, and 89%) are higher by 15% compared to the accuracies of the best previously known methods. The prediction based on the complimentary greedy search almost matches the best accuracy but is much more scalable.
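
The scale of the pairwise search mentioned in the abstract can be illustrated with a rough count. The sketch below is not taken from the paper: it assumes a panel of about one million SNPs and three possible genotype values per SNP, and the function name `pairwise_search_space` is hypothetical. Under those assumptions, the number of (SNP pair, genotype pair) combinations comes out above 10^12.

```python
from math import comb


def pairwise_search_space(num_snps: int, genotypes_per_snp: int = 3) -> int:
    """Count (SNP pair, genotype pair) combinations for an exhaustive pairwise scan.

    Both parameters are illustrative assumptions, not values from the paper.
    """
    snp_pairs = comb(num_snps, 2)              # unordered pairs of SNPs
    genotype_pairs = genotypes_per_snp ** 2    # genotype value combinations per pair
    return snp_pairs * genotype_pairs


num_snps = 1_000_000  # assumed genome-wide panel size
print(comb(num_snps, 2))                # 499999500000 SNP pairs
print(pairwise_search_space(num_snps))  # 4499995500000 combinations
```

The count grows quadratically with the number of SNPs for pairs and faster still for larger combinations, which is why an exhaustive search becomes impractical and a greedy strategy is attractive.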

References

  1. Affymetrix (2005), http://www.affymetrix.com/products/arrays/

  2. International HapMap Consortium: The International HapMap Project. Nature 426, 789–796 (2003), http://www.hapmap.org

  3. Daly, M., Rioux, J., Schaffner, S., Hudson, T., Lander, E.: High resolution haplotype structure in the human genome. Nature Genetics 29, 229–232 (2001)

  4. Barkhash, A., Perelygin, A., Brinza, D., Pilipenko, P., Bogdanova, Y.U., Romaschenko, A., Voevoda, M., Brinton, M.: Genetic Resistance to Flaviviruses. In: 5th Conf. on Bioinformatics of Genome Regulation and Structure (BGRS 2006) (to appear, 2006)

  5. Brinza, D., Zelikovsky, A.: 2SNP: Scalable Phasing Based on 2-SNP Haplotypes. Bioinformatics 22(3), 371–373 (2006)

  6. Brinza, D., He, J., Zelikovsky, A.: Combinatorial Search Methods for Multi-SNP Disease Association. In: Brinza, D., He, J., Zelikovsky, A. (eds.) Proc. IEEE Conf. on Engineering in Medicine and Biology (EMBC 2006) (September 2006) (to appear)

  7. Clark, A.G.: Finding Genes Underlying Risk of Complex Disease by Linkage Disequilibrium Mapping. Curr. Opin. Genet. Dev. 13(3), 296–302 (2003)

  8. Clark, A.G., et al.: Determinants of the success of whole-genome association testing. Genome Res. 15, 1463–1467 (2005)

  9. Stephens, M., Smith, N.J., Donnelly, P.: A New Statistical Method for Haplotype Reconstruction from Population Data. The American J. of Human Genetics 68, 978–998 (2001)

  10. Ueda, H., Howson, J.M.M., Esposito, L., et al.: Association of the T Cell Regulatory Gene CTLA4 with Susceptibility to Autoimmune Disease. Nature 423, 506–511 (2003)

  11. He, J., Zelikovsky, A.: Tag SNP Selection Based on Multivariate Linear Regression. In: Alexandrov, V.N., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds.) ICCS 2006. LNCS, vol. 3992, pp. 750–757. Springer, Heidelberg (2006)

  12. Marchini, J., Donnelly, P., Cardon, L.R.: Genome-wide strategies for detecting multiple loci that influence complex diseases. Nature Genetics 37, 413–417 (2005)

  13. Joachims, T.: http://svmlight.joachims.org/

  14. Breiman, L., Cutler, A.: http://www.stat.berkeley.edu/users/breiman/RF

  15. Mao, W., He, J., Brinza, D., Zelikovsky, A.: A Combinatorial Method for Predicting Genetic Susceptibility to Complex Diseases. In: Proc. IEEE Conf. on Engineering In Medicine and Biology (EMBC 2005), pp. 224–227 (2005)

  16. Mao, W., Brinza, D., Hundewale, N., Gremalschi, S., Zelikovsky, A.: Genotype Susceptibility and Integrated Risk Factors for Complex Diseases. In: Proc. IEEE Conf. on Granular Computing (GRC 2006), pp. 754–757 (2006)

  17. Kimmel, G., Shamir, R.: A Block-Free Hidden Markov Model for Genotypes and Its Application to Disease Association. J. of Computational Biology 12(10), 1243–1260 (2005)

  18. Listgarten, J., Damaraju, S., Poulin, B., Cook, L., Dufour, J., Driga, A., Mackey, J., Wishart, D., Greiner, R., Zanke, B.: Predictive Models for Breast Cancer Susceptibility from Multiple Single Nucleotide Polymorphisms. Clinical Cancer Research 10, 2725–2737 (2004)

  19. Nelson, M.R., Kardia, S.L., Ferrell, R.E., Sing, C.F.: A combinatorial partitioning method to identify multilocus genotypic partitions that predict quantitative trait variation. Genome Res. 11, 458–470 (2001)

  20. Tahri-Daizadeh, N., Tregouet, D.A., Nicaud, V., Manuel, N., Cambien, F., Tiret, L.: Automated detection of informative combined effects in genetic association studies of complex traits. Genome Res. 13, 1952–1960 (2003)

  21. Tomita, Y., Yokota, M., Honda, H.: Classification method for prediction of multifactorial disease development using interaction between genetic and environmental factors. In: IEEE Comput. Systems Bioinformatics Conf. CSB 2005, poster (2005)

  22. Waddell, M., Page, D., Zhan, F., Barlogie, B., Shaughnessy, J.: Predicting Cancer Susceptibility from Single Nucleotide Polymorphism Data: A Case Study in Multiple Myeloma. In: Proc. BIOKDD 2005 (2005)

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Brinza, D., Zelikovsky, A. (2006). Combinatorial Methods for Disease Association Search and Susceptibility Prediction. In: Bücher, P., Moret, B.M.E. (eds) Algorithms in Bioinformatics. WABI 2006. Lecture Notes in Computer Science, vol 4175. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11851561_27

  • DOI: https://doi.org/10.1007/11851561_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-39583-6

  • Online ISBN: 978-3-540-39584-3

  • eBook Packages: Computer Science, Computer Science (R0)
