Abstract
This paper describes an automatic summarization approach that constructs a summary by extracting the significant sentences of a document. The approach relies only on the co-occurrence relationships between terms within the document itself. Two techniques are used: principal component analysis (PCA) to extract the significant terms, and singular value decomposition (SVD) to find the significant sentences. PCA quantifies both term frequency and term-term relationships in the document through its eigenvalue-eigenvector pairs. SVD decomposes the sentence-term matrix into sentence-concentrated and term-concentrated matrices of appropriate dimension, which are used to compute the Euclidean distances between sentence and term vectors; the decomposition also removes the noise caused by variability in term usage. Experimental results on Korean newspaper articles show that the proposed method is preferable to random selection of sentences, or to PCA alone, when summarization is the goal.
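The pipeline outlined in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the function names `significant_terms` and `rank_sentences`, the toy matrix `A`, the rule of scoring terms by eigenvalue-weighted absolute loadings, the scaling of both sentence and term vectors by the singular values, and the use of the minimum distance to any significant term as the sentence score.

```python
import numpy as np

def significant_terms(A, n_components=1, top=2):
    """Pick the terms with the largest loadings on the leading principal components.

    A is a sentence-term matrix (rows: sentences, columns: terms).
    The scoring rule is an illustrative assumption, not the paper's exact criterion.
    """
    cov = np.cov(A, rowvar=False)               # term-term covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    # Weight each term's absolute loading by the eigenvalue of its component.
    score = np.abs(eigvecs[:, order]) @ eigvals[order]
    return np.argsort(score)[::-1][:top]

def rank_sentences(A, term_idx, k=2):
    """Rank sentences by Euclidean distance to the significant term vectors.

    Sentence and term vectors are taken from a rank-k SVD; scaling both by
    the singular values is an assumption made for this sketch.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    sent_vecs = U[:, :k] * s[:k]                # one row per sentence
    term_vecs = Vt[:k, :].T * s[:k]             # one row per term
    diffs = sent_vecs[:, None, :] - term_vecs[term_idx][None, :, :]
    dist = np.linalg.norm(diffs, axis=2)        # shape: sentences x significant terms
    return np.argsort(dist.min(axis=1))         # closest sentences first

# Hypothetical toy data: 4 sentences, 4 terms (raw term counts).
A = np.array([[2, 1, 0, 0],
              [1, 2, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)

terms = significant_terms(A, n_components=1, top=2)
ranking = rank_sentences(A, terms, k=2)
```

Truncating the SVD to `k` dimensions is what discards the low-variance directions, which is one way to read the abstract's remark about removing noise from variability in term usage. A summary would then take the first few entries of `ranking`.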
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, C., Park, H., Ock, C. (2005). Significant Sentence Extraction by Euclidean Distance Based on Singular Value Decomposition. In: Dale, R., Wong, KF., Su, J., Kwong, O.Y. (eds) Natural Language Processing – IJCNLP 2005. IJCNLP 2005. Lecture Notes in Computer Science(), vol 3651. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11562214_56
DOI: https://doi.org/10.1007/11562214_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29172-5
Online ISBN: 978-3-540-31724-1
eBook Packages: Computer Science, Computer Science (R0)