Abstract
Literature knowledge graphs play a critical role in helping domain experts find relevant articles in the published literature. Such knowledge graphs usually take the form of Curated Document Databases (CDDs), which domain experts and researchers query using some form of query-resolution mechanism. Machine learning techniques can be used to automate query resolution. This paper presents a document query-resolution mechanism that, given a query and a set of documents in a knowledge graph, uses a hybrid word embedding that combines knowledge graph embeddings with “traditional” embeddings. A query-document data set extracted from a clinical trials CDD (the ORRCA CDD) was used for the evaluation. Three “traditional” word embeddings were considered: CBOW, BERT and SciBERT. The evaluation demonstrated that the hybrid embeddings produced better results than the embedding models used in isolation. The best Mean Average Precision, 0.486, was obtained using a hybrid of CBOW and random walk knowledge graph embeddings.
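The following is a minimal sketch of the kind of pipeline the abstract describes, not the authors' implementation. It assumes that the hybrid embedding is formed by concatenating an L2-normalised text vector with an L2-normalised knowledge graph vector, and that documents are ranked by cosine similarity to the query; the abstract specifies neither choice. All function names and the toy vectors are hypothetical. Mean Average Precision is computed in the standard way.

```python
import math
from typing import Dict, List, Sequence, Set


def l2_normalise(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def hybrid_embedding(text_vec: Sequence[float], graph_vec: Sequence[float]) -> List[float]:
    """ASSUMPTION: combine the two embeddings by concatenating their
    normalised forms. The abstract does not state how they are combined."""
    return l2_normalise(text_vec) + l2_normalise(graph_vec)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def rank_documents(query_vec: Sequence[float], doc_vecs: Dict[str, Sequence[float]]) -> List[str]:
    """Return document ids ordered from most to least similar to the query."""
    return sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)


def average_precision(ranking: Sequence[str], relevant: Set[str]) -> float:
    """Average of precision@k taken at each rank k that holds a relevant document."""
    hits, total = 0, 0.0
    for k, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(rankings: Sequence[Sequence[str]], relevants: Sequence[Set[str]]) -> float:
    """Mean of the per-query average precision values."""
    aps = [average_precision(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(aps) / len(aps) if aps else 0.0


# Toy data: 2-d "text" vectors and 2-d "graph" vectors (illustrative values only).
docs = {
    "d1": hybrid_embedding([1.0, 0.0], [1.0, 0.0]),
    "d2": hybrid_embedding([0.0, 1.0], [1.0, 0.0]),
    "d3": hybrid_embedding([0.0, 1.0], [0.0, 1.0]),
}
query = hybrid_embedding([1.0, 0.0], [1.0, 0.0])
ranking = rank_documents(query, docs)
map_score = mean_average_precision([ranking], [{"d1", "d3"}])
```

In practice the text vectors would come from a model such as CBOW, BERT or SciBERT and the graph vectors from a random walk method; here they are hand-written so that the ranking and scoring logic can be checked in isolation.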
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Muhammad, I., Coenen, F., Gamble, C., Kearney, A., Williamson, P. (2022). Query Resolution of Literature Knowledge Graphs Using Hybrid Document Embeddings. In: Bramer, M., Stahl, F. (eds) Artificial Intelligence XXXIX. SGAI-AI 2022. Lecture Notes in Computer Science, vol 13652. Springer, Cham. https://doi.org/10.1007/978-3-031-21441-7_7
DOI: https://doi.org/10.1007/978-3-031-21441-7_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21440-0
Online ISBN: 978-3-031-21441-7
eBook Packages: Computer Science (R0)