Speed and accuracy improvement of higher-order epistasis detection on CUDA-enabled GPUs

Cluster Computing

Abstract

The discovery of higher-order epistatic interactions is an important task in genome-wide association studies, since it allows the identification of complex interaction patterns between multiple genetic markers. Some existing brute-force approaches explore the whole space of k-interactions exhaustively, which results in almost intractable execution times. The computational cost can be reduced drastically by restricting the search space with suitable preprocessing filters that prune unpromising candidates. Other approaches reduce execution time by employing massively parallel accelerators to exploit the vast computational resources of these architectures. In this paper, we combine a novel preprocessing filter, namely SingleMI, with massively parallel computation on modern GPUs to further accelerate epistasis discovery. Our implementation improves both runtime and accuracy compared to a previous GPU counterpart that employs mutual information clustering for prefiltering. SingleMI is open source software and publicly available at: https://github.com/sleeepyjack/singlemi/.
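The abstract does not spell out how the SingleMI filter scores candidates. As a rough illustration of the general idea of prefiltering, the sketch below assumes a filter that ranks each SNP by its empirical mutual information with the case-control phenotype and keeps only the top-scoring ones. The function names `mutual_information` and `top_snps_by_mi`, the genotype encoding (0, 1, 2) and the toy data are assumptions made for this example. They are not taken from the SingleMI code base, and the sketch runs on the CPU rather than on a GPU.

```python
import math
from collections import Counter


def mutual_information(genotypes, phenotypes):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(genotypes)
    joint = Counter(zip(genotypes, phenotypes))
    g_counts = Counter(genotypes)
    p_counts = Counter(phenotypes)
    mi = 0.0
    for (g, p), c in joint.items():
        # p(g, p) * log2(p(g, p) / (p(g) * p(p))), expressed with raw counts
        mi += (c / n) * math.log2(c * n / (g_counts[g] * p_counts[p]))
    return mi


def top_snps_by_mi(snp_matrix, phenotypes, keep):
    """Return the indices of the `keep` SNPs with the highest single-SNP MI."""
    scores = [mutual_information(snp, phenotypes) for snp in snp_matrix]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:keep]


# Hypothetical toy data: 8 samples (4 controls, 4 cases), 3 SNPs.
phenotypes = [0, 0, 0, 0, 1, 1, 1, 1]
snp_matrix = [
    [0, 1, 2, 0, 0, 1, 2, 0],  # same genotype distribution in both groups
    [0, 0, 0, 0, 2, 2, 2, 2],  # genotype fully determines the phenotype
    [0, 0, 1, 1, 1, 1, 2, 2],  # partially informative
]

kept = top_snps_by_mi(snp_matrix, phenotypes, keep=2)
print(kept)
```

Under these assumptions, keeping m of n SNPs shrinks the number of k-way combinations that a subsequent exhaustive search must evaluate from C(n, k) to C(m, k), which is the kind of search-space restriction the abstract refers to.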



Author information

Corresponding author

Correspondence to Jorge González Domínguez.

About this article

Cite this article

Jünger, D., Hundt, C., Domínguez, J.G. et al. Speed and accuracy improvement of higher-order epistasis detection on CUDA-enabled GPUs. Cluster Comput 20, 1899–1908 (2017). https://doi.org/10.1007/s10586-017-0938-9

  • DOI: https://doi.org/10.1007/s10586-017-0938-9
