Automatic Extraction for Creating a Lexical Repository of Abbreviations in the Biomedical Literature

Song, Min; Song, Il-Yeol; Lee, Ki Jung

doi:10.1007/11823728_37

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 4081))

Included in the following conference series:

International Conference on Data Warehousing and Knowledge Discovery

Abstract

The volume of biomedical text is growing at an exponential rate. This growth creates challenges for both human readers and automatic text processing algorithms. One such challenge arises from the common and uncontrolled use of abbreviations in the biomedical literature, which in turn requires that biomedical lexical ontologies be continuously updated. In this paper, we propose a hybrid approach that combines lexical analysis techniques with the Support Vector Machine (SVM) to create an automatically generated and maintained lexicon of abbreviations. The proposed technique differs from others in the following respects: 1) it incorporates lexical analysis techniques into supervised learning for extracting abbreviations; 2) it makes use of text chunking techniques to identify the long forms of abbreviations; 3) it significantly improves Recall compared to other techniques. The experimental results show that our approach outperforms the leading abbreviation algorithms, ExtractAbbrev and ALICE, by at least 6% and 13.9%, respectively, in both Precision and Recall on the Gold Standard Development corpus.
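The abstract does not spell out the lexical analysis step, so the following is only a minimal sketch of one common heuristic, not the authors' method: find a parenthesized short form and take the preceding words as a long-form candidate. The pattern `SHORT_FORM`, the function `candidate_pairs`, and the window-trimming rule are all hypothetical names and choices made for illustration. In a hybrid pipeline of the kind described, such candidate pairs could then be passed to a supervised classifier such as an SVM for acceptance or rejection.

```python
import re

# Assumption: a short form is a parenthesized token of 2 to 10 characters
# made of letters, digits, or hyphens, e.g. "(SVM)".
SHORT_FORM = re.compile(r"\(([A-Za-z0-9][A-Za-z0-9\-]{1,9})\)")


def candidate_pairs(sentence):
    """Return (short_form, long_form_candidate) pairs found in one sentence."""
    pairs = []
    for match in SHORT_FORM.finditer(sentence):
        short = match.group(1)
        # Skip parenthesized tokens with no uppercase letter, e.g. "(see)".
        if not any(ch.isupper() for ch in short):
            continue
        # Words that appear before the opening parenthesis.
        words = sentence[:match.start()].split()
        # Assumption: the long form has at most as many words as the
        # short form has characters.
        window = words[-len(short):]
        # Drop leading words until one starts with the short form's first letter.
        while window and window[0][0].lower() != short[0].lower():
            window = window[1:]
        if window:
            pairs.append((short, " ".join(window)))
    return pairs


if __name__ == "__main__":
    text = "We combine lexical analysis and the Support Vector Machine (SVM)."
    print(candidate_pairs(text))
```

This heuristic only proposes candidates; it will miss long forms whose words do not align with the short form's letters, which is where chunking and a learned classifier would be expected to help.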

References

Ao, H., Takagi, T.: ALICE: An algorithm to extract abbreviations from MEDLINE. Journal of the American Medical Informatics Association 12, 576–586 (2005)
Aronson, A.R.: Effective Mapping of Biomedical Text to the UMLS Metathesaurus: the MetaMap Program. In: Proceedings of the AMIA Symposium, pp. 17–21 (2001)
Chang, J.T., Schütze, H., Altman, R.B.: Creating an Online Dictionary of Abbreviations from MEDLINE. Journal of the American Medical Informatics Association 9, 612–620 (2002)
Cohen, A., Hersh, W.: A Survey of Current Work in Biomedical Text Mining. Briefings in Bioinformatics 6, 57–71 (2005)
Cortes, C., Vapnik, V.: Support-vector Networks. Machine Learning 20, 273–297 (1995)
Kudo, T., Matsumoto, Y.: Use of Support Vector Learning for Chunk Identification. In: Proceedings of the CoNLL-2000 and LLL-2000, pp. 142–144 (2000)
Liu, H., Aronson, A.R., Friedman, C.: A Study of Abbreviations in MEDLINE Abstracts. In: Proceedings of the AMIA Annual Fall Symposium, pp. 64–69 (2002)
Liu, H., Friedman, C.: Mining Terminological Knowledge in Large Biomedical Corpora. Proceedings of the Pacific Symposium on Biocomputing 8, 415–426 (2003)
Schwartz, A.S., Hearst, M.A.: A simple algorithm for identifying abbreviation definitions in biomedical text. Proceedings of the Pacific Symposium on Biocomputing 8, 451–462 (2003)
Yu, H., Hripcsak, G., Friedman, C.: Mapping abbreviations to full forms in biomedical articles. Journal of the American Medical Informatics Association 9, 162–172 (2002)
Yu, Z., Tsuruoka, Y., Tsujii, J.: Automatic Resolution of Ambiguous Abbreviations in Biomedical Texts using Support Vector Machines and One Sense Per Discourse Hypothesis. In: Proceedings of the SIGIR (2003)

Author information

Authors and Affiliations

Information Systems Department, New Jersey Institute of Technology, University Heights, Newark, NJ, 07102-1982
Min Song
College of Information Science & Technology, Drexel University, Philadelphia, PA, 19104
Il-Yeol Song & Ki Jung Lee

Editor information

Editors and Affiliations

Institute of Software Technology and Interactive Systems, Vienna University of Technology, Favoritenstr. 9-11/188, A-1040, Wien, Austria
A Min Tjoa
Department of Software and Computing Systems, University of Alicante, Spain
Juan Trujillo

About this paper

Cite this paper

Song, M., Song, IY., Lee, K.J. (2006). Automatic Extraction for Creating a Lexical Repository of Abbreviations in the Biomedical Literature. In: Tjoa, A.M., Trujillo, J. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2006. Lecture Notes in Computer Science, vol 4081. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11823728_37

DOI: https://doi.org/10.1007/11823728_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37736-8
Online ISBN: 978-3-540-37737-5
eBook Packages: Computer Science, Computer Science (R0)
