Abstract
Automatic document summarization is an essential technique for displaying text on small devices such as mobile phones and other handheld devices. Most research on automatic document summarization has focused on sentence extraction. However, the extracted sentences are often so long that even the summary is difficult to display on a small device, so compressing sentences is of practical help. In this paper, we present a pilot system that automatically compresses a Korean sentence using knowledge extracted from news articles and their headlines. A compressed sentence generated by our system resembles a news headline, so it can be one of the briefest forms that preserves the core meaning of the original sentence. A preliminary experiment shows that our compression system is promising.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, K.J., Kim, JH. (2005). Sentence Compression Learned by News Headline for Displaying in Small Device. In: Myaeng, S.H., Zhou, M., Wong, KF., Zhang, HJ. (eds) Information Retrieval Technology. AIRS 2004. Lecture Notes in Computer Science, vol 3411. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31871-2_6
DOI: https://doi.org/10.1007/978-3-540-31871-2_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25065-4
Online ISBN: 978-3-540-31871-2
eBook Packages: Computer Science, Computer Science (R0)