Query Resolution of Literature Knowledge Graphs Using Hybrid Document Embeddings

Muhammad, Iqra; Coenen, Frans; Gamble, Carol; Kearney, Anna; Williamson, Paula

doi:10.1007/978-3-031-21441-7_7

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13652)

Included in the following conference series:

International Conference on Innovative Techniques and Applications of Artificial Intelligence

Abstract

Literature knowledge graphs play a critical role in helping domain experts carry out query resolution to find relevant articles in the published literature. Such knowledge graphs usually take the form of Curated Document Databases (CDDs). Domain experts and researchers typically query such literature knowledge graphs using some form of query resolution mechanism. Machine learning techniques can be used to automate query resolution. This paper presents a document query resolution mechanism, given a query and a set of documents in a knowledge graph, based on a hybrid word embedding that combines knowledge graph embeddings with “traditional” embeddings. A query-document data set extracted from a clinical trials CDD (the ORRCA CDD) was used. Three “traditional” word embeddings were considered: CBOW, BERT and SciBERT. The evaluation demonstrated that hybrid embeddings produced better results than the embedding models used in isolation. A best Mean Average Precision of 0.486 was obtained when using a CBOW and random walk knowledge graph hybrid embedding.
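As a rough illustration of the idea summarised in the abstract, the following minimal sketch builds a hybrid embedding, ranks documents against a query and scores the ranking with average precision. It is not the authors' implementation: the abstract does not say how the two embeddings are combined or how queries are embedded, so concatenation of L2-normalised vectors, cosine similarity ranking, the toy vectors and all function names (`hybrid_embedding`, `rank_documents`, and so on) are assumptions made purely for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def hybrid_embedding(text_vec, graph_vec):
    """Assumed combination rule: concatenate the two normalised vectors."""
    return l2_normalize(text_vec) + l2_normalize(graph_vec)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_documents(query_vec, doc_vecs):
    """Return document ids ordered by descending similarity to the query."""
    scores = {doc_id: cosine(query_vec, vec) for doc_id, vec in doc_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


def average_precision(ranking, relevant):
    """Mean of precision@k taken at each rank k that holds a relevant document."""
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(rankings, relevant_sets):
    """Average the per-query average precision over all queries."""
    pairs = list(zip(rankings, relevant_sets))
    if not pairs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in pairs) / len(pairs)


# Toy data (hypothetical): each item has a 2-d "text" vector and a 2-d "graph" vector.
docs = {
    "d1": hybrid_embedding([1.0, 0.0], [1.0, 0.0]),
    "d2": hybrid_embedding([0.0, 1.0], [0.0, 1.0]),
    "d3": hybrid_embedding([1.0, 1.0], [1.0, 0.0]),
}
query = hybrid_embedding([1.0, 0.0], [1.0, 0.0])

ranking = rank_documents(query, docs)
ap = average_precision(ranking, {"d1", "d2"})
print(ranking, round(ap, 3))
```

In a realistic setting the text vectors would come from a model such as CBOW, BERT or SciBERT and the graph vectors from a random walk based knowledge graph embedding; the ranking and Mean Average Precision logic would stay the same.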

Notes

1. https://www.orrca.org.uk/
2. https://www.nltk.org/

Author information

Authors and Affiliations

The University of Liverpool, Liverpool, L69 3BX, UK
Iqra Muhammad, Frans Coenen, Carol Gamble, Anna Kearney & Paula Williamson

Corresponding author

Correspondence to Iqra Muhammad.

Editor information

Editors and Affiliations

University of Portsmouth, Portsmouth, UK
Max Bramer
DFKI: German Research Center for Artificial Intelligence, Oldenburg, Germany
Frederic Stahl

About this paper

Cite this paper

Muhammad, I., Coenen, F., Gamble, C., Kearney, A., Williamson, P. (2022). Query Resolution of Literature Knowledge Graphs Using Hybrid Document Embeddings. In: Bramer, M., Stahl, F. (eds) Artificial Intelligence XXXIX. SGAI-AI 2022. Lecture Notes in Computer Science, vol 13652. Springer, Cham. https://doi.org/10.1007/978-3-031-21441-7_7

DOI: https://doi.org/10.1007/978-3-031-21441-7_7
Published: 05 December 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21440-0
Online ISBN: 978-3-031-21441-7
eBook Packages: Computer Science, Computer Science (R0)
