
EASY: Evaluation System for Summarization

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2019)

Abstract

Automatic text summarization aims to produce a shorter version of a document (or a document set). Extractive summarizers compile summaries by extracting a subset of sentences from a given text, while abstractive summarizers generate new sentences. Both types of summarizers strive to preserve the meaning of the original document as much as possible. Evaluating summarization quality is a challenging task. Due to the expense of human evaluations, many researchers prefer to evaluate their systems automatically, with the help of software tools. Automatic evaluations usually compare a system-generated summary to one or more human-written summaries according to selected measures. However, a single metric cannot reflect all quality-related aspects of a summary. For instance, evaluating an extractive summarizer by comparing its summaries, at the word level, to abstracts written by humans is not good enough, because the summaries being compared do not necessarily use the same vocabulary. Moreover, considering only single words does not reflect the coherence or readability of a generated summary. Multiple tools and metrics have been proposed in the literature for evaluating the quality of summarizers; however, studies show that correlations between these metrics do not always hold. In this paper we present the EvAluation SYstem for Summarization (EASY), which enables the evaluation of summaries with several quality measures. The EASY system can also compare system-generated summaries to the extractive summaries produced by the OCCAMS baseline, which is considered the best possible extractive summarizer. EASY currently supports two languages, English and French, and is freely available online for the NLP community.
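The vocabulary-mismatch problem the abstract points out can be illustrated with a minimal ROUGE-1-style unigram-recall sketch. This is a simplified illustration of word-level overlap scoring, not EASY's actual implementation; the function name and example sentences are hypothetical:

```python
from collections import Counter

def rouge1_recall(candidate: str, reference: str) -> float:
    """Unigram-overlap recall: fraction of reference tokens covered by the candidate."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(cand[w], ref[w]) for w in ref)
    return overlap / max(sum(ref.values()), 1)

# An extractive summary reuses the source wording; a human abstract paraphrases it.
extract = "the cat sat on the mat"
human_abstract = "a feline rested on a rug"
print(rouge1_recall(extract, human_abstract))  # low score despite similar meaning
```

Here the two sentences convey nearly the same content, yet only the word "on" overlaps, so the word-level score is low. This is exactly why a single lexical metric can misjudge an extractive summarizer against abstractive human references.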


Notes

  1. https://youtu.be/5AhZB5OfxN8.



Author information

Correspondence to Natalia Vanetik.


Copyright information

© 2023 Springer Nature Switzerland AG

About this paper


Cite this paper

Litvak, M., Vanetik, N., Veksler, Y. (2023). EASY: Evaluation System for Summarization. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2019. Lecture Notes in Computer Science, vol 13452. Springer, Cham. https://doi.org/10.1007/978-3-031-24340-0_40


  • DOI: https://doi.org/10.1007/978-3-031-24340-0_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-24339-4

  • Online ISBN: 978-3-031-24340-0

  • eBook Packages: Computer Science, Computer Science (R0)
