Abstract
Automated evaluation is crucial for automated text summarization, as it is for any language technology. In this paper we present a generative modeling framework for evaluating the content of summaries. We use two simple alternatives for identifying signature-terms in the reference summaries, based on model consistency and on part-of-speech (POS) features. The generative modeling approach captures the sentence-level presence of these signature-terms in peer summaries. We show that parts of speech such as nouns and verbs provide a simple and robust method of signature-term identification for the generative modeling approach. We also show that, for our approach, a large set of 'significant signature-terms' works better than a small set of 'strong signature-terms'. Our results show that the generative modeling approach is indeed promising, providing high correlations with manual evaluations, and that further investigation of signature-term identification methods could yield further improvements. The efficacy of the approach can be seen in its ability to capture 'overall responsiveness' much better than the state of the art when distinguishing a human from a system.
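The pipeline the abstract describes (POS-based signature-term identification from reference summaries, then sentence-level coverage of those terms in a peer summary) can be sketched roughly as follows. This is a minimal illustration, not the paper's actual model: the `POS` lexicon is a toy stand-in for a real tagger (the paper uses the Stanford maximum-entropy tagger), and `coverage_score` is a simple hit-rate proxy for the generative model of signature-term presence.

```python
from collections import Counter

# Toy POS lexicon standing in for a real tagger; the word list is
# purely illustrative (a real system would tag arbitrary text).
POS = {
    "evaluation": "NN", "summaries": "NN", "content": "NN",
    "systems": "NN", "capture": "VB", "measure": "VB",
    "the": "DT", "of": "IN", "we": "PRP", "and": "CC",
}

def signature_terms(reference_summaries, keep_tags=("NN", "VB")):
    """Collect candidate signature-terms: tokens from the reference
    summaries whose (toy) POS tag is a noun or verb."""
    counts = Counter()
    for summary in reference_summaries:
        for tok in summary.lower().split():
            if POS.get(tok) in keep_tags:
                counts[tok] += 1
    return counts

def coverage_score(peer_summary, sig_counts):
    """Fraction of peer sentences containing at least one
    signature-term -- a crude proxy for the paper's sentence-level
    modeling of signature-term presence."""
    sentences = [s for s in peer_summary.split(".") if s.strip()]
    hits = sum(
        any(tok in sig_counts for tok in s.lower().split())
        for s in sentences
    )
    return hits / len(sentences) if sentences else 0.0

sig = signature_terms(["Evaluation of summaries and systems"])
score = coverage_score(
    "These summaries aid evaluation. The weather is nice.", sig
)
```

Here `sig` keeps the nouns `evaluation`, `summaries`, and `systems` while discarding function words, and `score` is 0.5 because only the first of the two peer sentences mentions a signature-term.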
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Katragadda, R. (2010). GEMS: Generative Modeling for Evaluation of Summaries. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2010. Lecture Notes in Computer Science, vol 6008. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12116-6_61
DOI: https://doi.org/10.1007/978-3-642-12116-6_61
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12115-9
Online ISBN: 978-3-642-12116-6