Mining Biomedical Abstracts: What’s in a Term?

  • Conference paper
Natural Language Processing – IJCNLP 2004 (IJCNLP 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3248)

Abstract

In this paper we present a study of the usage of terminology in the biomedical literature, with the main aim of indicating phenomena that can be helpful for automatic term recognition in the domain. Our analysis is based on the terminology appearing in the Genia corpus. We analyse the usage of biomedical terms and their variants (namely inflectional and orthographic alternatives, terms with prepositions, coordinated terms, etc.), showing the variability and dynamic nature of terms used in biomedical abstracts. Term coordination and terms containing prepositions are analysed in detail. We also show that there is a discrepancy between terms used in the literature and terms listed in controlled dictionaries. In addition, we briefly evaluate the effectiveness of incorporating the treatment of different types of term variation into an automatic term recognition system.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nenadic, G., Spasic, I., Ananiadou, S. (2005). Mining Biomedical Abstracts: What’s in a Term? In: Su, KY., Tsujii, J., Lee, JH., Kwong, O.Y. (eds) Natural Language Processing – IJCNLP 2004. IJCNLP 2004. Lecture Notes in Computer Science, vol 3248. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30211-7_85

  • DOI: https://doi.org/10.1007/978-3-540-30211-7_85

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-24475-2

  • Online ISBN: 978-3-540-30211-7

  • eBook Packages: Computer Science, Computer Science (R0)
