Abstract
With vast amounts of information arriving over the Internet every day, it is not feasible to read each and every document. The information retrieved by a search engine is still far more than a user can handle and manage, so there is a need to present information in summarized form. In this paper, an automatic abstractive summarization technique for a single document is proposed. The sentences in the text are identified first. Then, from those sentence segments, the unique terms are identified. A Term-Sentence matrix is generated, in which the columns represent the sentences and the rows represent the terms. The entries in the matrix are weights derived from information gain. The column with the maximum cosine similarity is selected as the first sentence of the summary, and the remaining summary sentences are selected likewise. Results over documents indicate that the performance of the proposed approach compares very favorably with other approaches.
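The pipeline outlined in the abstract (sentence identification, unique terms, a Term-Sentence matrix, cosine-based selection) can be illustrated with a minimal Python sketch. The abstract does not say how information gain is computed or what each column is compared against, so both choices below are assumptions rather than the authors' method: the weight of a term is taken as the information its presence or absence carries about which sentence is being read (the binary entropy of its sentence frequency), scaled by its count in the sentence, and each column is scored by cosine similarity to the sum of all columns. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    # Lowercased alphanumeric tokens; no stemming or stop-word removal.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def binary_entropy(p):
    # Entropy (in bits) of a yes/no event with probability p.
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def build_matrix(sentences):
    # Rows are terms, columns are sentences, as in the abstract.
    counts = [Counter(tokenize(s)) for s in sentences]
    terms = sorted(set().union(*counts))
    n = len(sentences)
    matrix = []
    for term in terms:
        df = sum(1 for c in counts if term in c)
        # ASSUMPTION: information gain of term presence about sentence
        # identity, which reduces to the binary entropy of df / n.
        gain = binary_entropy(df / n)
        matrix.append([c[term] * gain for c in counts])
    return terms, matrix


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def summarize(text, k=2):
    sentences = split_sentences(text)
    if not sentences:
        return []
    _, matrix = build_matrix(sentences)
    columns = [[row[j] for row in matrix] for j in range(len(sentences))]
    # ASSUMPTION: each column is compared with the sum of all columns.
    centroid = [sum(row) for row in matrix]
    scores = [cosine(col, centroid) for col in columns]
    ranked = sorted(range(len(sentences)), key=lambda j: scores[j], reverse=True)
    # Sentences are returned in selection order, best-scoring first.
    return [sentences[j] for j in ranked[:k]]
```

Calling `summarize(text, k=3)` returns the three highest-scoring sentences in the order they were selected. Under this stand-in weighting, a term that occurs in every sentence receives zero weight, because its presence does not distinguish one sentence from another.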
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Prakash, C., Shukla, A. (2010). Automatic Summary Generation from Single Document Using Information Gain. In: Ranka, S., et al. Contemporary Computing. IC3 2010. Communications in Computer and Information Science, vol 94. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14834-7_15
DOI: https://doi.org/10.1007/978-3-642-14834-7_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14833-0
Online ISBN: 978-3-642-14834-7
eBook Packages: Computer Science, Computer Science (R0)