Abstract
This paper discusses an approach to topic-oriented multi-document summarization. It investigates the effectiveness of using additional information about the document set as a whole, as well as about individual documents. We present NEO-CORTEX, a multi-document summarization system based on the existing CORTEX system. Results are reported for experiments on a document base formed from the NIST DUC-2005 and DUC-2006 data. Our experiments show that NEO-CORTEX is an effective system and achieves good performance on the topic-oriented multi-document summarization task.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Boudin, F., Torres Moreno, J.M. (2007). NEO-CORTEX: A Performant User-Oriented Multi-Document Summarization System. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2007. Lecture Notes in Computer Science, vol 4394. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70939-8_49
DOI: https://doi.org/10.1007/978-3-540-70939-8_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70938-1
Online ISBN: 978-3-540-70939-8
eBook Packages: Computer Science, Computer Science (R0)