ABEE: automated bio entity extraction from biomedical text documents
Data Technologies and Applications
ISSN: 2514-9288
Article publication date: 25 January 2023
Issue publication date: 25 April 2023
Abstract
Purpose
The purpose of this study was to design a multitask learning model that extracts biomedical entities from biomedical texts without ambiguity.
Design/methodology/approach
The proposed automated bio entity extraction (ABEE) model is a multitask learning model built by combining single-task learning models. Each single-task model is trained with Bidirectional Encoder Representations from Transformers (BERT), and the models' outputs are then combined so that a variety of entities can be found in biomedical text.
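The abstract does not spell out how the single-task outputs are merged. As a minimal sketch, assuming each single-task tagger returns character-offset spans with a confidence score, the combination step might look like the following. The `Span` type, the `combine_outputs` function and the highest-score-wins rule for overlapping spans are illustrative assumptions, not the authors' method.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Span:
    """Hypothetical prediction from one single-task tagger."""
    start: int    # inclusive character offset
    end: int      # exclusive character offset
    label: str    # e.g. "GENE", "CHEMICAL", "DISEASE"
    score: float  # tagger confidence in [0, 1]


def combine_outputs(per_task: Dict[str, List[Span]]) -> List[Span]:
    """Merge spans from several single-task taggers.

    Assumed rule: when two spans overlap, keep the higher-scoring one.
    """
    candidates = [s for spans in per_task.values() for s in spans]
    # Visit the most confident predictions first.
    candidates.sort(key=lambda s: s.score, reverse=True)
    kept: List[Span] = []
    for cand in candidates:
        overlaps = any(cand.start < k.end and k.start < cand.end for k in kept)
        if not overlaps:
            kept.append(cand)
    # Return the surviving spans in document order.
    return sorted(kept, key=lambda s: s.start)
```

Resolving overlaps by score is only one possible way to avoid assigning two entity types to the same text; a different rule could be substituted inside the loop without changing the rest of the sketch.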
Findings
The proposed ABEE model targets unique gene/protein, chemical and disease entities in biomedical text. The findings are particularly important for biomedical research such as drug discovery and clinical trials. This research helps reduce not only researchers' effort but also the cost of new drug discoveries and new treatments.
Research limitations/implications
There are no limitations with the model as such, but the research team plans to test it on gigabytes of data and to build a knowledge graph so that researchers can easily estimate the entities of similar groups.
Practical implications
As far as practical implications are concerned, the ABEE model will be helpful in various natural language processing tasks. In information extraction (IE), it plays an important role in biomedical named entity recognition and biomedical relation extraction; it is also useful in information retrieval tasks such as literature-based knowledge discovery.
Social implications
During the COVID-19 pandemic, demand for this type of work increased because of the rise in clinical trials at that time. Had this type of research been introduced earlier, it would have reduced the time and effort needed for new drug discoveries in this area.
Originality/value
In this work, we propose a novel multitask learning model that is capable of extracting biomedical entities from biomedical text without ambiguity. The proposed model achieved state-of-the-art performance in terms of precision, recall and F1 score.
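For reference, the three metrics named above can be computed at the entity level from sets of gold and predicted entities. This is a minimal sketch assuming exact-match scoring; the abstract does not state the matching criterion, and the `prf1` function and the `(start, end, label)` tuple layout are illustrative.

```python
from typing import Set, Tuple

# (start, end, label); this layout is an assumption made for illustration.
Entity = Tuple[int, int, str]


def prf1(gold: Set[Entity], pred: Set[Entity]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact-match entity scoring."""
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```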
Acknowledgements
The authors gratefully acknowledge the Department of Computer Science and Engineering of the National Institute of Technology Raipur for providing infrastructure and facilities necessary for this work.
Funding: This research is not funded by any financial institution.
Authors' contributions: A.K. and A.S. hypothesized and designed the idea of the ABEE model. A.K. developed ABEE. A.K. and A.S. experimented and analyzed the results. A.S., as the supervisor of A.K., guided this research work. All authors read the final manuscript carefully and approved it.
Availability of data: All the corpora are openly licensed and available at https://github.com/cambridgeltl/MTL-Bioinformatics-2016/tree/master/data and https://github.com/SKumarAshutosh/ABEE.
Declaration of competing interests: The authors declare that they have no competing interests.
Citation
Kumar, A. and Sharaff, A. (2023), "ABEE: automated bio entity extraction from biomedical text documents", Data Technologies and Applications, Vol. 57 No. 2, pp. 222-244. https://doi.org/10.1108/DTA-04-2022-0151
Publisher
Emerald Publishing Limited
Copyright © 2023, Emerald Publishing Limited