Abstract
In this paper we present a study of terminology usage in the biomedical literature, with the main aim of identifying phenomena that can help automatic term recognition in the domain. Our analysis is based on the terminology appearing in the Genia corpus. We analyse the usage of biomedical terms and their variants (namely inflectional and orthographic alternatives, terms with prepositions, coordinated terms, etc.), showing the variability and dynamic nature of terms used in biomedical abstracts. Term coordination and terms containing prepositions are analysed in detail. We also show that there is a discrepancy between terms used in the literature and terms listed in controlled dictionaries. In addition, we briefly evaluate the effectiveness of incorporating the treatment of different types of term variation into an automatic term recognition system.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nenadic, G., Spasic, I., Ananiadou, S. (2005). Mining Biomedical Abstracts: What’s in a Term?. In: Su, KY., Tsujii, J., Lee, JH., Kwong, O.Y. (eds) Natural Language Processing – IJCNLP 2004. IJCNLP 2004. Lecture Notes in Computer Science, vol 3248. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30211-7_85
DOI: https://doi.org/10.1007/978-3-540-30211-7_85
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24475-2
Online ISBN: 978-3-540-30211-7
eBook Packages: Computer Science (R0)