The unknown knowns: a graph-based approach for temporal COVID-19 literature mining
ISSN: 1468-4527
Article publication date: 23 March 2021
Issue publication date: 24 August 2021
Abstract
Purpose
The COVID-19 pandemic has sparked a remarkable volume of research literature, and scientists increasingly need intelligent tools to cut through the noise and uncover relevant research directions. In response, the authors propose a framework built on a novel weighted semantic graph model that compresses the research studies efficiently. The authors also present two analyses of this graph that offer alternative ways to uncover additional aspects of COVID-19 research.
Design/methodology/approach
The authors construct the semantic graph using state-of-the-art natural language processing (NLP) techniques on COVID-19 publication texts (>100,000 texts). Next, the authors conduct an evolutionary analysis to capture the changes in COVID-19 research across time. Finally, the authors apply a link prediction study to detect novel COVID-19 research directions that are so far undiscovered.
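The three steps above (graph construction, comparison across time and link prediction) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy documents, the function names and the use of document co-occurrence counts as edge weights are assumptions made for illustration, and a simple shared-neighbour count stands in for the predictive models used in the study.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(docs):
    """Return {(a, b): weight}, where weight counts the documents
    in which concepts a and b co-occur (an assumed weighting scheme)."""
    weights = defaultdict(int)
    for concepts in docs:
        for a, b in combinations(sorted(set(concepts)), 2):
            weights[(a, b)] += 1
    return dict(weights)


def neighbours(weights):
    """Build an adjacency map {concept: set of linked concepts}."""
    adj = defaultdict(set)
    for a, b in weights:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def predict_links(weights):
    """Rank concept pairs that are not yet linked by their number of
    shared neighbours (a stand-in heuristic, not the paper's model)."""
    adj = neighbours(weights)
    scores = {}
    for a, b in combinations(sorted(adj), 2):
        if (a, b) not in weights:
            common = len(adj[a] & adj[b])
            if common:
                scores[(a, b)] = common
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy data: each document is reduced to its extracted concepts.
early = [["covid-19", "fever"], ["covid-19", "cough"]]
later = early + [["covid-19", "anosmia"], ["fever", "cough"]]

g_early = build_graph(early)
g_later = build_graph(later)

# Evolutionary view: edges present in the later graph but not the earlier one.
new_edges = sorted(set(g_later) - set(g_early))

# Predictive view: unlinked pairs that share at least one neighbour.
candidates = predict_links(g_later)

print(new_edges)
print(candidates)
```

In this toy example, the pairs returned by `predict_links` are concepts that never appear together in a document but are both connected to a common concept, which is the kind of undiscovered connection the link prediction step is meant to surface.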
Findings
Findings reveal the success of the semantic graph in capturing scientific knowledge and its evolution. Meanwhile, the prediction experiments achieve 79% accuracy in returning intelligible links, showing the reliability of the methods for predicting novel connections that could help scientists discover potential new directions.
Originality/value
To the authors’ knowledge, this is the first study to propose a holistic framework that encodes scientific knowledge in a semantic graph, demonstrates an evolutionary examination of past and ongoing research and offers scientists tools to generate new hypotheses and research directions through predictive modeling and deep machine learning techniques.
Keywords
Acknowledgements
In the interest of transparency, data sharing and reproducibility, the author(s) of this article have made the data underlying their research openly available. It can be accessed by following the link here: https://github.com/ubayram/COVIDGraphProject
The authors would like to thank Dakota Murray for his support and help.
Funding: This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Citation
Bayram, U., Roy, R., Assalil, A. and BenHiba, L. (2021), "The unknown knowns: a graph-based approach for temporal COVID-19 literature mining", Online Information Review, Vol. 45 No. 4, pp. 687-708. https://doi.org/10.1108/OIR-12-2020-0562
Publisher
Emerald Publishing Limited
Copyright © 2021, Emerald Publishing Limited