Semantic Disambiguation of Embedded Drug-Disease Associations Using Semantically Enriched Deep-Learning Approaches

Wawrzinek, Janus; Pinto, José María González; Wiehr, Oliver; Balke, Wolf-Tilo

doi:10.1007/978-3-030-59419-0_30

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12114)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications


Abstract

State-of-the-art approaches in the field of neural-embedding models (NEMs) enable progress in the automatic extraction and prediction of semantic relations between important entities such as active substances, diseases, and genes. In particular, their predictive capability makes them valuable for important research-related tasks such as hypothesis generation and drug repositioning. A core challenge in the biomedical domain is to obtain interpretable semantics from NEMs that can distinguish, for instance, between the following two situations: a) drug \( x \) induces disease \( y \) and b) drug \( x \) treats disease \( y \). However, NEMs alone cannot distinguish between association types such as "treats" and "induces". Is it possible to develop a model that learns a latent representation from the NEMs capable of such disambiguation? To what extent do we need domain knowledge to succeed in the task? In this paper, we answer both questions and show that our proposed approach not only succeeds in the disambiguation task but also advances current research efforts to find real predictions, using a sophisticated retrospective analysis.
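To make the disambiguation task concrete, the following is a minimal sketch, not the authors' method. It assumes hypothetical two-dimensional entity embeddings and invented "treats"/"induces" labels, and it swaps in a plain logistic regression over hand-built pair features where the paper uses deep-learning models. All names (`EMBEDDINGS`, `PAIRS`, `pair_features`, `train`) are illustrative assumptions.

```python
import math

# Hypothetical 2-d entity embeddings (toy values, not from a trained NEM).
EMBEDDINGS = {
    "drug_a": [0.9, 0.1],
    "drug_b": [0.1, 0.9],
    "disease_x": [0.8, 0.2],
    "disease_y": [0.2, 0.8],
}

# Invented labels for illustration only: 1 = "treats", 0 = "induces".
PAIRS = [
    ("drug_a", "disease_x", 1),
    ("drug_a", "disease_y", 0),
    ("drug_b", "disease_y", 1),
    ("drug_b", "disease_x", 0),
]


def pair_features(drug, disease):
    """Concatenate both embeddings and their element-wise product."""
    d, s = EMBEDDINGS[drug], EMBEDDINGS[disease]
    return d + s + [a * b for a, b in zip(d, s)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Probability that the pair is a 'treats' association."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def log_loss(weights, bias, data):
    """Mean binary cross-entropy over (features, label) examples."""
    total = 0.0
    for features, label in data:
        p = predict_proba(weights, bias, features)
        total -= label * math.log(p) + (1 - label) * math.log(1 - p)
    return total / len(data)


def train(data, epochs=2000, lr=0.5):
    """Fit logistic regression with full-batch gradient descent."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in data:
            err = predict_proba(weights, bias, features) - label
            grad_b += err
            for i, x in enumerate(features):
                grad_w[i] += err * x
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


DATA = [(pair_features(drug, disease), label) for drug, disease, label in PAIRS]

if __name__ == "__main__":
    weights, bias = train(DATA)
    for drug, disease, label in PAIRS:
        p = predict_proba(weights, bias, pair_features(drug, disease))
        print(drug, disease, "treats" if p >= 0.5 else "induces", round(p, 3))
```

In this toy setup the concatenated embeddings alone carry no linear signal about the association type, so the element-wise product is added as an interaction feature. A real system would presumably learn such interactions from the embedding space rather than hand-craft them.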

Author information

Authors and Affiliations

IFIS TU-Braunschweig, Mühlenpfordstrasse 23, 38106, Brunswick, Germany
Janus Wawrzinek, José María González Pinto, Oliver Wiehr & Wolf-Tilo Balke

Corresponding author

Correspondence to Janus Wawrzinek.

Editor information

Editors and Affiliations

Dankook University, Yongin, Korea (Republic of)
Yunmook Nah
Peking University, Haidian, China
Bin Cui
Sungkyunkwan University, Suwon, Korea (Republic of)
Sang-Won Lee
Department of Systems Engineering and En, The Chinese University of Hong Kong, Hong Kong, Hong Kong
Jeffrey Xu Yu
Kangwon National University, Chunchon, Korea (Republic of)
Yang-Sae Moon
Korea Advanced Institute of Science and, Daejeon, Korea (Republic of)
Steven Euijong Whang

About this paper

Cite this paper

Wawrzinek, J., Pinto, J.M.G., Wiehr, O., Balke, WT. (2020). Semantic Disambiguation of Embedded Drug-Disease Associations Using Semantically Enriched Deep-Learning Approaches. In: Nah, Y., Cui, B., Lee, SW., Yu, J.X., Moon, YS., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science, vol 12114. Springer, Cham. https://doi.org/10.1007/978-3-030-59419-0_30

DOI: https://doi.org/10.1007/978-3-030-59419-0_30
Published: 22 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59418-3
Online ISBN: 978-3-030-59419-0
eBook Packages: Computer Science, Computer Science (R0)
