A Heuristic Strategy for Extracting Terms from Scientific Texts

  • Conference paper
Analysis of Images, Social Networks and Texts (AIST 2015)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 542)

Abstract

The paper describes a strategy that applies heuristics to combine sets of terminological words and word combinations pre-extracted from a scientific text by several term recognition procedures. Each procedure is based on a collection of lexico-syntactic patterns representing specific linguistic information about terms in scientific texts. The strategy aims to improve the quality of automatic term extraction from a particular scientific text. Experiments have shown that the strategy yields an 11–17% increase in F-measure compared with commonly used term extraction methods.
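For readers unfamiliar with the evaluation measure named in the abstract, the following is a minimal sketch of how an F-measure can be computed for extracted term sets. It assumes the balanced F1 variant and exact string matching against a gold-standard term list; the abstract does not state which variant the authors used. The function name `f_measure`, the sample terms, and the plain set union used to combine the two procedures are all hypothetical and stand in for, rather than reproduce, the heuristics of the paper.

```python
def f_measure(extracted, gold):
    """Return (precision, recall, F1) of extracted terms against a gold set."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold-standard terms for one text (illustration only).
gold = {"term extraction", "lexico-syntactic pattern", "f-measure", "heuristic"}

# Hypothetical outputs of two term recognition procedures.
procedure_a = {"term extraction", "scientific text", "heuristic"}
procedure_b = {"lexico-syntactic pattern", "term extraction", "paper"}

# Assumption: a plain union as the simplest possible combination;
# the paper's strategy applies heuristics that are not reproduced here.
combined = procedure_a | procedure_b

for name, terms in [("A", procedure_a), ("B", procedure_b), ("A+B", combined)]:
    p, r, f = f_measure(terms, gold)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

On this toy data the union raises recall at some cost in precision, which is the trade-off a combining strategy has to manage.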

Acknowledgements

We would like to thank the anonymous reviewers of our paper for their helpful and constructive comments.

Author information

Corresponding author

Correspondence to Elena I. Bolshakova.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Bolshakova, E.I., Efremova, N.E. (2015). A Heuristic Strategy for Extracting Terms from Scientific Texts. In: Khachay, M., Konstantinova, N., Panchenko, A., Ignatov, D., Labunets, V. (eds) Analysis of Images, Social Networks and Texts. AIST 2015. Communications in Computer and Information Science, vol 542. Springer, Cham. https://doi.org/10.1007/978-3-319-26123-2_29

  • DOI: https://doi.org/10.1007/978-3-319-26123-2_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26122-5

  • Online ISBN: 978-3-319-26123-2

  • eBook Packages: Computer Science, Computer Science (R0)
