Semantic Disambiguation of Embedded Drug-Disease Associations Using Semantically Enriched Deep-Learning Approaches

  • Conference paper
  • Published in: Database Systems for Advanced Applications (DASFAA 2020)

Abstract

State-of-the-art neural-embedding models (NEMs) enable the automatic extraction and prediction of semantic relations between important entities such as active substances, diseases, and genes. Their predictive capability in particular makes them valuable for research tasks such as hypothesis generation and drug repositioning. A core challenge in the biomedical domain is obtaining interpretable semantics from NEMs that can distinguish, for instance, between the following two situations: a) drug \( x \) induces disease \( y \) and b) drug \( x \) treats disease \( y \). NEMs alone, however, cannot distinguish between association types such as treats and induces. Is it possible to develop a model that learns a latent representation from NEMs capable of such disambiguation? To what extent is domain knowledge needed to succeed in the task? In this paper, we answer both questions and show that our proposed approach not only succeeds in the disambiguation task but also advances ongoing research efforts to identify real predictions by means of a sophisticated retrospective analysis.
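To make the disambiguation problem concrete, the following minimal sketch shows why a similarity score between embeddings cannot by itself separate treats from induces. It is an illustration under stated assumptions, not the approach proposed in the paper: the four-dimensional vectors are invented toy values, and the helper names `cosine_similarity` and `pair_features` are hypothetical. The concatenated pair vector stands in for one plausible input to a supervised classifier that could be trained on labeled drug-disease pairs.

```python
import numpy as np


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def pair_features(drug_vec, disease_vec):
    """Concatenate a drug and a disease embedding into one feature vector.

    Assumption: a downstream classifier would consume this vector together
    with a label such as "treats" or "induces".
    """
    return np.concatenate([drug_vec, disease_vec])


# Hypothetical toy embeddings (illustrative values only, not real data).
drug = np.array([0.9, 0.1, 0.3, 0.0])
disease_treated = np.array([0.8, 0.2, 0.4, 0.1])
disease_induced = np.array([0.8, 0.2, 0.1, 0.4])

# Both diseases lie close to the drug in the embedding space, so the
# similarity score signals an association but not its type.
sim_treated = cosine_similarity(drug, disease_treated)
sim_induced = cosine_similarity(drug, disease_induced)

# The pair representation keeps every dimension of both vectors, which
# gives a learned model more to work with than a single scalar.
features_treated = pair_features(drug, disease_treated)
features_induced = pair_features(drug, disease_induced)
```

In this toy setting both similarity scores are high, so thresholding them would flag both pairs as associated without saying how. Whether a model trained on such pair vectors can recover the association type, and how much domain knowledge it needs, are the questions the abstract raises.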

Author information

Correspondence to Janus Wawrzinek.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Wawrzinek, J., Pinto, J.M.G., Wiehr, O., Balke, WT. (2020). Semantic Disambiguation of Embedded Drug-Disease Associations Using Semantically Enriched Deep-Learning Approaches. In: Nah, Y., Cui, B., Lee, SW., Yu, J.X., Moon, YS., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science, vol 12114. Springer, Cham. https://doi.org/10.1007/978-3-030-59419-0_30

  • DOI: https://doi.org/10.1007/978-3-030-59419-0_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59418-3

  • Online ISBN: 978-3-030-59419-0

  • eBook Packages: Computer Science, Computer Science (R0)
