
Data-Driven Unsupervised Evaluation of Automatic Text Summarization Systems

  • Conference paper
Advances in Artificial Intelligence and Its Applications (MICAI 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9414)


Abstract

Automatic text summarization is a text compression problem with many applications in natural language processing. In this paper we focus on the problem of evaluating text summarization systems. We propose an unsupervised approach based on keywords: it does not require a large amount of manual processing and can be implemented as a fully automatic procedure. We also conduct a series of experiments with naïve informants and professional experts. The results of the experiments with informants, experts and automatically extracted keywords confirm that keywords, as one type of text compression, can be successfully used to evaluate summary quality. Our data is represented by (but not restricted to) different types of Russian news texts.
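
The abstract does not reproduce the scoring procedure itself, but the intuition behind keyword-based evaluation can be sketched as follows. The snippet below is an illustrative, simplified sketch and not the authors' exact metric: it scores a candidate summary by the fraction of reference keywords it covers, where the keyword list could come from informants, experts, or an automatic extractor such as TextAnalyst [8]. The function names and example data are hypothetical, and a real pipeline for Russian news texts would additionally need lemmatization.

    import re

    def tokenize(text):
        # Lowercased word tokens; a real pipeline for Russian would also lemmatize.
        return re.findall(r"\w+", text.lower())

    def keyword_coverage(summary, keywords):
        # Recall-style score: fraction of reference keywords that occur in the summary.
        summary_tokens = set(tokenize(summary))
        keywords = [k.lower() for k in keywords]
        if not keywords:
            return 0.0
        hits = sum(1 for k in keywords if k in summary_tokens)
        return hits / len(keywords)

    # Hypothetical usage: rank two candidate summaries of the same article
    # by how well they cover the reference keyword list.
    reference_keywords = ["parliament", "election", "budget", "reform"]
    summary_a = "Parliament approved the budget after the election."
    summary_b = "The weather in the capital was unusually warm."
    print(keyword_coverage(summary_a, reference_keywords))  # 0.75
    print(keyword_coverage(summary_b, reference_keywords))  # 0.0

Under these assumptions, a summary that mentions more of the reference keywords scores higher; a precision-style counterpart (fraction of summary content words that are keywords) could be added in the same way.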


Notes

  1. These terms (precision, etc.), well known in the NLP community, should be interpreted differently here: they represent metrics by which experts estimate the quality of summaries, rather than automatically calculated quality measures. For example, in [5] experts assign each summary precision and redundancy values from a rating scale.

  2. Initially, 25 articles were given to each of the informants, and they were asked to write down the words. However, after we received the first answers, we decided to change the instruction: we asked the informants to underline the words in the text. This helped us avoid misprints and preserve information about the positions of the words in the text. We also reduced the number of articles to 12, because the preliminary results showed that the number of errors invariably increased with the number of articles.

References

  1. Hennig, L., De Luca, E.W., Albayrak, S.: Learning summary content units with topic modeling. In: COLING 2010: Poster Volume, pp. 391–399 (2010)


  2. Luhn, H.P.: The automatic creation of literature abstracts. IBM J. Res. Dev. 2(2), 157–165 (1958)


  3. Nenkova, A., Passonneau, R.: Evaluating content selection in summarization: the pyramid method. In: HLT-NAACL 2004: Main Proceedings, pp. 145–152 (2004)


  4. Robertson, S.E., Walker, S., Jones, S., Hancock-Beaulieu, M., Gatford, M.: Okapi at TREC-3. In: Proceedings of the Third Text REtrieval Conference (TREC 1994) (1994)


  5. Solov’ev, A.N., Antonova, A.J., Pazel’skaja, A.G.: Using sentiment-analysis for text information extraction. In: Computational Linguistics and Intelligent Technology: Proceedings of the Annual International Conference “Dialogue”, vol. 11, no. 18, in 2 vols., vol. 1: The Main Program of the Conference, pp. 616–627. Publishing House of the Russian State Humanitarian University (2012)


  6. Yagunova, E.V., Makarova, O.E., Antonov, A.Y., Solovyov, A.N.: Various compression methods in the study of understanding the text of the news. In: Understanding in Communication. Man in the Information Space, vol. 2, pp. 414–421. Publishing House of YAGPU, Yaroslavl – Moscow (2012)


  7. Lenta.ru: Rambler Media Group. http://www.lenta.ru

  8. TextAnalyst. http://www.analyst.ru/index.php?lang=eng&dir=content/products/&id=ta


Acknowledgements

The authors acknowledge Saint-Petersburg State University for the research grant 30.38.305.2014.

Author information


Corresponding author

Correspondence to Elena Yagunova.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Yagunova, E., Makarova, O., Pronoza, E. (2015). Data-Driven Unsupervised Evaluation of Automatic Text Summarization Systems. In: Pichardo Lagunas, O., Herrera Alcántara, O., Arroyo Figueroa, G. (eds) Advances in Artificial Intelligence and Its Applications. MICAI 2015. Lecture Notes in Computer Science(), vol 9414. Springer, Cham. https://doi.org/10.1007/978-3-319-27101-9_3


  • DOI: https://doi.org/10.1007/978-3-319-27101-9_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27100-2

  • Online ISBN: 978-3-319-27101-9

  • eBook Packages: Computer Science (R0)
