Abstract
Automatic text summarization aims to produce a shorter version of a document (or a set of documents). Extractive summarizers compile summaries by extracting a subset of sentences from a given text, while abstractive summarizers generate new sentences. Both types of summarizers strive to preserve the meaning of the original document as much as possible. Evaluating summarization quality is a challenging task. Due to the expense of human evaluation, many researchers prefer to evaluate their systems automatically, with the help of software tools. Automatic evaluation usually compares a system-generated summary to one or more human-written summaries according to selected measures. However, a single metric cannot reflect all quality-related aspects of a summary. For instance, evaluating an extractive summarizer by comparing its summaries, at the word level, to abstracts written by humans is insufficient, because the summaries being compared do not necessarily use the same vocabulary. Moreover, considering only single words does not reflect the coherence or readability of a generated summary. Multiple tools and metrics have been proposed in the literature for evaluating the quality of summarizers; however, studies show that these metrics do not always correlate with one another. In this paper we present the EvAluation SYstem for Summarization (EASY), which enables the evaluation of summaries with several quality measures. The EASY system can also compare system-generated summaries to the extractive summaries produced by the OCCAMS baseline, which is considered the best possible extractive summarizer. EASY currently supports two languages, English and French, and is freely available online for the NLP community.
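To make the word-level comparison concrete, the following minimal sketch computes a unigram-recall overlap (in the spirit of ROUGE-1) between a system summary and a reference abstract. The example texts, the whitespace tokenizer, and the function name are illustrative assumptions for this sketch, not part of EASY.

```python
from collections import Counter

def unigram_recall(system_summary: str, reference_summary: str) -> float:
    """ROUGE-1-style recall: fraction of reference unigrams covered by the system summary."""
    sys_counts = Counter(system_summary.lower().split())
    ref_counts = Counter(reference_summary.lower().split())
    # Clipped overlap: each reference word is matched at most as many times as it occurs there.
    overlap = sum(min(count, sys_counts[word]) for word, count in ref_counts.items())
    return overlap / max(sum(ref_counts.values()), 1)

# Hypothetical example: same meaning, different vocabulary -> low word-level score.
system = "The firm reported higher quarterly profits."
reference = "The company announced an increase in earnings for the quarter."
print(round(unigram_recall(system, reference), 2))  # ~0.1 despite similar meaning
```

Two summaries with the same meaning but different wording score poorly under such a word-level measure, which is precisely why a single metric is not enough and why EASY exposes several complementary quality measures.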
© 2023 Springer Nature Switzerland AG
Cite this paper
Litvak, M., Vanetik, N., Veksler, Y. (2023). EASY: Evaluation System for Summarization. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2019. Lecture Notes in Computer Science, vol. 13452. Springer, Cham. https://doi.org/10.1007/978-3-031-24340-0_40