Abstract
The number of publicly available clinical studies is constantly increasing, forming a promising corpus of documents for clinical research purposes. However, the abbreviations used in these studies pose a serious barrier to any text mining technique. This paper presents a study in this domain, which used specifically developed tools and mechanisms to process a number of randomly selected documents from clinicaltrialsregister.eu. The analysis indicated that abbreviations frequently appear without their long form (also known as the expansion). To determine an abbreviation's true meaning, it is necessary to use an appropriate corpus of documents, apply innovative algorithms and techniques to detect its possible expansions, and then select the appropriate one. Furthermore, the discrimination power of tokens plays a distinctive role in the construction of abbreviations and can therefore facilitate the detection of acronym-type abbreviations. Additionally, the expressions in which abbreviations appear, as well as the preceding and following text, are of primary importance for selecting the appropriate meaning.
References
Gale, W.A., Church, K.W., Yarowsky, D.: One sense per discourse. In: Proceedings of the Workshop on Speech and Natural Language HLT 1991, pp. 233–237. New York (1992)
Schwartz, A.S., Hearst, M.A.: A simple algorithm for identifying abbreviation definitions in biomedical text. In: Proceedings of PSB, pp. 451–462 (2003)
EU Clinical Trials Register. www.clinicaltrialsregister.eu
Porter, M.F.: An algorithm for suffix stripping. Program 40(3), 211–218 (2006)
ClinicalTrials.gov. www.clinicaltrials.gov
Medical Subject Headings (MeSH). http://www.nlm.nih.gov/mesh/
Karanastasis, E., Andronikou, V., Chondrogiannis, E., Tsatsaronis, G., Eisinger, D., Petrova, A.: The OpenScienceLink architecture for novel services exploiting open access data in the biomedical domain. In: Proceedings of PCI 2014, pp. 28:1–28:6. ACM, New York (2014)
Xu, Y., Wang, Z., Lei, Y., Zhao, Y., Xue, Y.: MBA: a literature mining system for extracting biomedical abbreviations. BMC Bioinform. 10, 14 (2009)
McCarthy, D., Koeling, R., Weeds, J., Carroll, J.: Finding predominant word senses in untagged text. In: Proceedings of ACL 2004, Stroudsburg, PA, USA, pp. 280–287 (2004)
Stevenson, M., Guo, Y., Amri, A.A., Gaizauskas, R.: Disambiguation of biomedical abbreviations. In: Proceedings of BioNLP 2009, Boulder, Colorado, USA, pp. 71–79 (2009)
McInnes, B.T., Pedersen, T., Carlis, J.: Using UMLS concept unique identifiers (CUIs) for word sense disambiguation in the biomedical domain. In: AMIA 2007, pp. 533–537 (2007)
CT abbreviations-annotated corpus. http://147.102.19.246:8080/AbbrAnnotatedCorpus/
Chang, J.T., Schütze, H., Altman, R.B.: Creating an online dictionary of abbreviations from MEDLINE. J. Am. Med. Inform. Assoc. 9(6), 612–620 (2002)
Pustejovsky, J., Castaño, J., Cochran, B., Kotecki, M., Morrell, M.: Automatic extraction of acronym-meaning pairs from MEDLINE databases. Stud. Health Tech. I. 84(1), 371–375 (2001)
Zhou, W., Torvik, V.I., Smalheiser, N.R.: ADAM: another database of abbreviations in MEDLINE. Bioinformatics 22(22), 2813–2818 (2006)
Park, Y., Byrd, R.J.: Hybrid text mining for finding abbreviations and their definitions. In: Proceedings of EMNLP 2001 Conference, pp. 126–133 (2001)
Acknowledgements
This work is being supported by the OpenScienceLink project [8] and has been partially funded by the European Commission’s CIP-PSP under contract number 325101. This paper expresses the opinions of the authors and not necessarily those of the European Commission. The European Commission is not liable for any use that may be made of the information contained in this paper.
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Chondrogiannis, E., Andronikou, V., Karanastasis, E., Varvarigou, T. (2015). Meaning Inference of Abbreviations Appearing in Clinical Studies. In: Sierra-Rodríguez, JL., Leal, JP., Simões, A. (eds) Languages, Applications and Technologies. SLATE 2015. Communications in Computer and Information Science, vol 563. Springer, Cham. https://doi.org/10.1007/978-3-319-27653-3_4
DOI: https://doi.org/10.1007/978-3-319-27653-3_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27652-6
Online ISBN: 978-3-319-27653-3
eBook Packages: Computer Science, Computer Science (R0)