DOI: 10.1145/3292500.3330912
research-article

GCN-MF: Disease-Gene Association Identification By Graph Convolutional Networks and Matrix Factorization

Published: 25 July 2019

Abstract

Discovering disease-gene associations is a fundamental and critical biomedical task that helps biologists and physicians uncover the pathogenic mechanisms of syndromes. With various clinical biomarkers measuring the similarities among genes and disease phenotypes, network-based semi-supervised learning (NSSL) has been commonly used in these studies to address this class-imbalanced, large-scale data problem. However, most existing NSSL approaches are based on linear models and suffer from two major limitations: 1) they implicitly consider a local-structure representation for each candidate; 2) they are unable to capture nonlinear associations between diseases and genes. In this paper, we propose a new framework for the disease-gene association task, named GCN-MF, that combines a Graph Convolutional Network (GCN) with matrix factorization. With the help of the GCN, we can capture nonlinear interactions and exploit the measured similarities. Moreover, we define a margin control loss function to reduce the effect of sparsity. Empirical results demonstrate that the proposed deep learning algorithm outperforms all other state-of-the-art methods on most metrics.
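The abstract gives no equations, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes the standard renormalized propagation rule of Kipf and Welling for the graph convolution and a plain inner product for the factorization step. The toy similarity graphs, the hand-picked weights, and all function names are hypothetical, and the paper's margin control loss is not reproduced here.

```python
import math


def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a non-negative adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node keeps part of its own signal.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(adj_norm, features, weights):
    """One graph convolution: ReLU(adj_norm @ features @ weights)."""
    hidden = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]


def score_matrix(gene_emb, disease_emb):
    """Factorization-style scores: inner product of gene and disease embeddings."""
    disease_t = [list(col) for col in zip(*disease_emb)]
    return matmul(gene_emb, disease_t)


# Hypothetical toy data: 3 genes and 2 diseases, each with its own similarity graph.
gene_adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
disease_adj = [[0.0, 1.0], [1.0, 0.0]]

# One-hot input features (an assumption; real inputs would be measured similarities).
gene_feat = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
disease_feat = [[1.0 if i == j else 0.0 for j in range(2)] for i in range(2)]

# Fixed illustrative weights; in practice these would be learned.
w_gene = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.6]]
w_disease = [[0.7, 0.1], [-0.4, 0.9]]

gene_emb = gcn_layer(normalize_adjacency(gene_adj), gene_feat, w_gene)
disease_emb = gcn_layer(normalize_adjacency(disease_adj), disease_feat, w_disease)

# scores[g][d] is the predicted association strength between gene g and disease d.
scores = score_matrix(gene_emb, disease_emb)
print(scores)
```

The ReLU inside each layer is what makes the embeddings a nonlinear function of the similarity graphs, whereas the final inner product plays the role of the matrix factorization.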

    Published In

    KDD '19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
    July 2019
    3305 pages
    ISBN:9781450362016
    DOI:10.1145/3292500

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. deep learning
    2. disease-gene association
    3. graph convolutional networks

    Qualifiers

    • Research-article

    Conference

    KDD '19
    Acceptance Rates

    KDD '19 Paper Acceptance Rate 110 of 1,200 submissions, 9%;
    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

    Article Metrics

    • Downloads (Last 12 months): 175
    • Downloads (Last 6 weeks): 15
    Reflects downloads up to 19 Feb 2025

    Cited By

    • (2025) Drug Repurposing: Insights into Current Advances and Future Applications. Current Medicinal Chemistry 32:3 (468-510). DOI: 10.2174/0109298673266470231023110841. Online publication date: Jan-2025
    • (2025) Therapeutic gene target prediction using novel deep hypergraph representation learning. Briefings in Bioinformatics 26:1. DOI: 10.1093/bib/bbaf019. Online publication date: 22-Jan-2025
    • (2025) Computational approaches for predicting drug-disease associations: a comprehensive review. Frontiers of Computer Science: Selected Publications from Chinese Universities 19:5. DOI: 10.1007/s11704-024-40072-y. Online publication date: 1-May-2025
    • (2024) Effective Tool Augmented Multi-Agent Framework for Data Analysis. Data Intelligence. DOI: 10.3724/2096-7004.di.2024.0013. Online publication date: 17-Oct-2024
    • (2024) VGE: Gene-Disease Association by Variational Graph Embedding. International Journal of Crowd Science 8:2 (95-99). DOI: 10.26599/IJCS.2024.9100004. Online publication date: May-2024
    • (2024) A Fast Nonnegative Autoencoder-Based Approach to Latent Feature Analysis on High-Dimensional and Incomplete Data. IEEE Transactions on Services Computing 17:3 (733-746). DOI: 10.1109/TSC.2023.3319713. Online publication date: May-2024
    • (2024) Generative Essential Graph Convolutional Network for Multi-View Semi-Supervised Classification. IEEE Transactions on Multimedia 26 (7987-7999). DOI: 10.1109/TMM.2024.3374579. Online publication date: 7-Mar-2024
    • (2024) Graph Representation Learning Based on Specific Subgraphs for Biomedical Interaction Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 21:5 (1552-1564). DOI: 10.1109/TCBB.2024.3402741. Online publication date: Sep-2024
    • (2024) A Plug-In Graph Neural Network to Boost Temporal Sensitivity in fMRI Analysis. IEEE Journal of Biomedical and Health Informatics 28:9 (5323-5334). DOI: 10.1109/JBHI.2024.3415000. Online publication date: Sep-2024
    • (2024) Research on Online Shopping Mall Product Classification Recommendation Based on Graph Neural Network. 2024 IEEE 3rd International Conference on Electrical Engineering, Big Data and Algorithms (EEBDA) (297-301). DOI: 10.1109/EEBDA60612.2024.10485996. Online publication date: 27-Feb-2024
