Abstract
Supervised methods for generic document summarization are time-consuming because they require a set of training documents and their associated summaries. We propose a new unsupervised method that uses the Non-negative Semantic Variable to select sentences for automatic generic document summarization. The proposed method selects meaningful sentences, and it can improve the quality of generic summaries because the extracted sentences cover the major topics of the document well. Because it is unsupervised, it does not need training data. The experimental results demonstrate that the proposed method achieves better performance than the other method.
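The abstract does not spell out how the Non-negative Semantic Variable is computed or how sentences are scored, so the following is only a minimal sketch of the general idea, under stated assumptions: a term-by-sentence count matrix is factored with standard multiplicative-update non-negative matrix factorization, the rows of the resulting matrix `H` are treated as the semantic variables, and each sentence is scored by its total weight across them. The function names, the whitespace tokenizer, the scoring rule, and the parameters `r` and `k` are all hypothetical and are not taken from the paper.

```python
import numpy as np

def term_sentence_matrix(sentences):
    """Build a non-negative term-by-sentence count matrix (naive whitespace tokens)."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(sentences)))
    for j, s in enumerate(sentences):
        for w in s.lower().split():
            A[index[w], j] += 1.0
    return A

def nmf(A, r, iters=200, seed=0, eps=1e-9):
    """Approximate A (terms x sentences) as W @ H with W, H >= 0.

    Uses multiplicative updates for the Frobenius-norm objective.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, r)) + eps
    H = rng.random((r, n)) + eps
    for _ in range(iters):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

def select_sentences(sentences, r=2, k=2):
    """Pick k sentences by an assumed score: column sums of H."""
    A = term_sentence_matrix(sentences)
    _, H = nmf(A, r)
    # Assumption: a sentence's relevance is its summed weight over
    # the r semantic variables (rows of H). The paper may differ.
    scores = H.sum(axis=0)
    top = np.argsort(-scores)[:k]
    # Return the chosen sentences in their original document order.
    return [sentences[j] for j in sorted(top)]
```

Because no labeled summaries are involved at any point, a sketch like this needs only the document itself as input, which is the property the abstract emphasizes.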
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Park, S. (2008). Generic Summarization Using Non-negative Semantic Variable. In: Huang, DS., Wunsch, D.C., Levine, D.S., Jo, KH. (eds) Advanced Intelligent Computing Theories and Applications. With Aspects of Theoretical and Methodological Issues. ICIC 2008. Lecture Notes in Computer Science, vol 5226. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87442-3_126
DOI: https://doi.org/10.1007/978-3-540-87442-3_126
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87440-9
Online ISBN: 978-3-540-87442-3
eBook Packages: Computer Science (R0)