EpIntMC: Detecting Epistatic Interactions Using Multiple Clusterings

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 12304)

Abstract

Detecting epistatic interactions between multiple single nucleotide polymorphisms (SNPs) is crucial for identifying susceptibility genes associated with complex human diseases. Stepwise search approaches have been extensively studied as a way to greatly reduce the search space for follow-up SNP interaction detection. However, most of these stepwise methods are prone to filtering out significant polymorphism combinations and thus have low detection power. In this paper, we propose a two-stage approach called EpIntMC, which uses multiple clusterings to significantly shrink the search space while reducing the risk of filtering out significant combinations before the follow-up detection. EpIntMC first introduces a matrix-factorization-based approach to generate multiple diverse clusterings that group SNPs into different clusters from different aspects, which helps to explore the genotype data more comprehensively and reduces the chance of filtering out potential candidates overlooked by a single clustering. In the search stage, EpIntMC applies an entropy score to screen SNPs in each cluster and uses Jaccard similarity to merge the most similar clusters into candidate sets. EpIntMC then performs an exhaustive search on these candidate sets to precisely detect epistatic interactions. Extensive simulation experiments show that EpIntMC has higher (or comparable) power than related competitive solutions, and results on the Wellcome Trust Case Control Consortium (WTCCC) dataset also demonstrate its effectiveness.
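The two building blocks named for the search stage, an entropy-based score and Jaccard similarity between clusters, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact form of the entropy score or the merging rule, so mutual information between a SNP's genotypes and the case/control labels is assumed here as one plausible entropy-based score, and all function names and the toy SNP identifiers are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(genotypes, phenotypes):
    """I(G; P) = H(G) + H(P) - H(G, P).

    Assumed stand-in for an entropy-based SNP score; the paper's
    actual score may be defined differently.
    """
    joint = list(zip(genotypes, phenotypes))
    return entropy(genotypes) + entropy(phenotypes) - entropy(joint)


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| between two sets of SNP ids."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar_pair(clusters):
    """Indices (i, j) of the two clusters with the highest Jaccard similarity."""
    return max(
        combinations(range(len(clusters)), 2),
        key=lambda ij: jaccard(clusters[ij[0]], clusters[ij[1]]),
    )


# Toy example: three SNP clusters taken from (hypothetical) different clusterings.
clusters = [{"rs1", "rs2", "rs3"}, {"rs2", "rs3", "rs4"}, {"rs7", "rs8"}]
i, j = most_similar_pair(clusters)
candidate_set = clusters[i] | clusters[j]  # merged candidate set for exhaustive search
```

In this toy input the first two clusters share two of four distinct SNPs, so they are the pair that gets merged; a real pipeline would screen each cluster by score first and then search SNP combinations within each candidate set.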

Acknowledgements

This research is supported by NSFC (61872300), the Fundamental Research Funds for the Central Universities (XDJK2020B028 and XDJK2019B024), and the Natural Science Foundation of CQ CSTC (cstc2018jcyjAX0228).

Author information

Corresponding author

Correspondence to Jun Wang.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, H., Yu, G., Ren, W., Guo, M., Wang, J. (2020). EpIntMC: Detecting Epistatic Interactions Using Multiple Clusterings. In: Cai, Z., Mandoiu, I., Narasimhan, G., Skums, P., Guo, X. (eds) Bioinformatics Research and Applications. ISBRA 2020. Lecture Notes in Computer Science, vol 12304. Springer, Cham. https://doi.org/10.1007/978-3-030-57821-3_6

  • DOI: https://doi.org/10.1007/978-3-030-57821-3_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-57820-6

  • Online ISBN: 978-3-030-57821-3

  • eBook Packages: Computer Science, Computer Science (R0)
