Abstract
This paper describes a summarization engine developed primarily for the Czech language. Accordingly, the engine relies on language-dependent preprocessing modules that segment the input document into sentences, lemmatize it, and substitute synonyms. The system is implemented as a dynamic library that can be used in either a web or a desktop application, and it supports a variety of summarization methods. To evaluate the performance of the system, several experiments are conducted using a set of manually created summaries. The results show that, for Czech, our engine performs better than, or at least comparably to, other online summarization systems. The reference summaries and the summarization engine are available online at http://summec.ite.tul.cz.
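As a rough illustration of the preprocessing steps named in the abstract (sentence segmentation, lemmatization, synonym substitution), the following minimal sketch feeds them into a simple extractive summarizer. It is not the SummEC implementation: the tiny `LEMMAS`, `SYNONYMS` and `STOPWORDS` tables, the function names, and the average-term-frequency scoring are all assumptions chosen for illustration, and a real system would use full Czech dictionaries and one of the summarization methods the engine actually supports.

```python
import re
from collections import Counter

# Hypothetical toy resources (assumptions, not SummEC data).
LEMMAS = {"kočku": "kočka", "psa": "pes", "pejska": "pejsek"}
SYNONYMS = {"pejsek": "pes"}  # maps a lemma to a canonical synonym
STOPWORDS = {"a", "i", "v", "na", "je", "se", "to"}


def split_sentences(text):
    """Naive segmentation: split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def normalize(sentence):
    """Lowercase, lemmatize, substitute synonyms, and drop stop words."""
    terms = []
    for token in re.findall(r"\w+", sentence.lower()):
        lemma = LEMMAS.get(token, token)
        term = SYNONYMS.get(lemma, lemma)
        if term not in STOPWORDS:
            terms.append(term)
    return terms


def summarize(text, n_sentences=1):
    """Return the n highest-scoring sentences in their original order."""
    sentences = split_sentences(text)
    terms = [normalize(s) for s in sentences]
    freq = Counter(t for sent in terms for t in sent)

    def score(i):
        # Average document frequency of the sentence's normalized terms.
        if not terms[i]:
            return 0.0
        return sum(freq[t] for t in terms[i]) / len(terms[i])

    ranked = sorted(range(len(sentences)), key=score, reverse=True)
    chosen = sorted(ranked[:n_sentences])
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    print(summarize("Pes honí kočku. Pejska to nebaví. Venku prší.", 2))
```

In this toy example, lemmatization and synonym substitution map both "Pes" and "Pejska" to the term "pes", so the two sentences about the dog reinforce each other when term frequencies are counted. That is the motivation for language-dependent normalization in a highly inflected language such as Czech.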
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rott, M., Červa, P. (2013). SummEC: A Summarization Engine for Czech. In: Habernal, I., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2013. Lecture Notes in Computer Science, vol 8082. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40585-3_66
DOI: https://doi.org/10.1007/978-3-642-40585-3_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40584-6
Online ISBN: 978-3-642-40585-3
eBook Packages: Computer Science, Computer Science (R0)