Inferring Drug-miRNA Associations by Integrating Drug SMILES and MiRNA Sequence Information

  • Conference paper
Intelligent Computing Theories and Application (ICIC 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12464)

Abstract

Accumulating evidence indicates that drugs not only interact with proteins but also regulate a wide variety of biomarkers such as miRNAs. Uncovering potential drug-miRNA associations therefore plays a significant role in disease prevention, diagnosis, and treatment, as well as in drug development. In this paper, we formulate the problem as a link prediction task in a bipartite graph and construct a computational model to infer unknown drug-miRNA associations. Specifically, drug SMILES (Simplified Molecular Input Line Entry System) strings and miRNA sequences are treated as a kind of biological language and encoded as distributed representations. Experimentally verified associations are treated as positive samples, and an equal number of unlabeled associations are randomly selected as negative samples. Finally, a Random Forest classifier is applied to perform the prediction task. In the experiment, the proposed method achieves an AUROC of 91.16 and an AUPR of 89.21 under 5-fold cross-validation. These results demonstrate the potential of integrating deep learning with biological big data. We hope this work can serve as a practical guidance tool and offer useful inspiration to researchers in the field.
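
The sample-construction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy identifiers, the fixed seed, and the interleaved fold assignment are all assumptions made for illustration. It shows only how known drug-miRNA pairs become positive samples, how an equal number of unlabeled pairs is drawn at random as negatives, and how the balanced set can be partitioned for 5-fold cross-validation. Feature encoding and the Random Forest classifier are left out.

```python
import itertools
import random


def build_balanced_samples(drugs, mirnas, known_pairs, seed=0):
    """Label every known pair 1 and add an equal number of randomly
    chosen unlabeled pairs labeled 0 (hypothetical helper)."""
    positives = set(known_pairs)
    # Any drug-miRNA pair without a recorded association is unlabeled.
    unlabeled = [p for p in itertools.product(drugs, mirnas)
                 if p not in positives]
    if len(unlabeled) < len(positives):
        raise ValueError("not enough unlabeled pairs to balance positives")
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    negatives = rng.sample(unlabeled, len(positives))
    samples = [(p, 1) for p in sorted(positives)]
    samples += [(p, 0) for p in negatives]
    rng.shuffle(samples)
    return samples


def k_fold_splits(samples, k=5):
    """Yield (train, test) lists; fold i tests on every k-th sample
    starting at index i (one simple partitioning choice among many)."""
    for i in range(k):
        test = samples[i::k]
        train = [s for j, s in enumerate(samples) if j % k != i]
        yield train, test


# Toy data: placeholder identifiers, not taken from the paper.
drugs = ["drug_A", "drug_B", "drug_C", "drug_D"]
mirnas = ["miR_1", "miR_2", "miR_3", "miR_4", "miR_5"]
known_pairs = [("drug_A", "miR_1"), ("drug_A", "miR_3"),
               ("drug_B", "miR_2"), ("drug_C", "miR_5"),
               ("drug_D", "miR_4")]

samples = build_balanced_samples(drugs, mirnas, known_pairs)
folds = list(k_fold_splits(samples, k=5))
for fold_id, (train, test) in enumerate(folds):
    print(fold_id, len(train), len(test))
```

In a full pipeline, each pair would be replaced by a feature vector built from the drug and miRNA representations before training a classifier on the training part of each fold. Because the negatives are drawn from unlabeled pairs, some of them may be true associations that have not yet been verified.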

Z. Guo and Z. You contributed equally to this work.


Acknowledgements

The authors thank all the editors and anonymous reviewers for their constructive advice.

Funding

This work is supported in part by the National Natural Science Foundation of China (Grant nos. 61722212, 61902342).

Author information

Contributions

Z-H. G. and Z-H. Y. conceived the algorithm, arranged the datasets, and performed the analyses. H-C. Y., Y-B. W. and Z-H. C. wrote the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Zhu-Hong You.

Ethics declarations

The authors declare that they have no competing interests.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Guo, ZH., You, ZH., Li, LP., Chen, ZH., Yi, HC., Wang, YB. (2020). Inferring Drug-miRNA Associations by Integrating Drug SMILES and MiRNA Sequence Information. In: Huang, DS., Jo, KH. (eds) Intelligent Computing Theories and Application. ICIC 2020. Lecture Notes in Computer Science, vol 12464. Springer, Cham. https://doi.org/10.1007/978-3-030-60802-6_25

  • DOI: https://doi.org/10.1007/978-3-030-60802-6_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-60801-9

  • Online ISBN: 978-3-030-60802-6

  • eBook Packages: Computer Science, Computer Science (R0)
