Automatic Summary Generation from Single Document Using Information Gain

Prakash, Chandra; Shukla, Anupam

doi:10.1007/978-3-642-14834-7_15

Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 94)

Abstract

With tons of information pouring in every day over the Internet, it is not feasible to read each and every document. The information retrieved by a search engine is still far more than a user can handle and manage, so there is a need to present it in summarized form. In this paper, an automatic abstractive summarization technique for a single document is proposed. The sentences in the text are identified first. Then, from those sentence segments, unique terms are identified. A Term-Sentence matrix is generated, where the columns represent the sentences and the rows represent the terms. The entries in the matrix are weights derived from information gain. The column with the maximum cosine similarity is selected as the first sentence of the summary, and the remaining summary sentences are selected likewise. Results over documents indicate that the performance of the proposed approach compares very favorably with other approaches.
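
The pipeline outlined in the abstract (sentence identification, unique terms, a Term-Sentence matrix, cosine-based column selection) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions. The abstract gives no information gain formula, so the sketch substitutes an IDF-style self-information weight, tf * log2(N / df). The abstract does not say what each column is compared against, so the sketch uses the sum of all columns as a document vector. The function names, the naive regex sentence splitter, and the parameter k are hypothetical. The sketch also only selects existing sentences and does not generate new text.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercased alphanumeric tokens; no stemming or stop-word removal."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def build_matrix(sentences):
    """Return (terms, matrix) where matrix[i][j] weights term i in sentence j."""
    counts = [Counter(tokenize(s)) for s in sentences]
    terms = sorted(set().union(*counts))
    n = len(sentences)
    matrix = []
    for term in terms:
        df = sum(1 for c in counts if term in c)
        # ASSUMPTION: stand-in for the paper's information gain weight.
        # Self-information of the term over sentences (an IDF-style weight).
        info = math.log2(n / df)
        matrix.append([c[term] * info for c in counts])
    return terms, matrix


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def summarize(text, k=2):
    """Pick the k sentences whose columns are most similar to the document."""
    sentences = split_sentences(text)
    _, matrix = build_matrix(sentences)
    columns = [list(col) for col in zip(*matrix)]
    # ASSUMPTION: the reference vector is the sum of all sentence columns.
    doc = [sum(row) for row in matrix]
    ranked = sorted(
        range(len(columns)),
        key=lambda j: cosine(columns[j], doc),
        reverse=True,
    )
    # Present the selected sentences in their original document order.
    return [sentences[j] for j in sorted(ranked[:k])]


if __name__ == "__main__":
    sample = "Cats chase mice. Dogs chase cats. Birds sing songs."
    for line in summarize(sample, k=2):
        print(line)
```

Calling summarize(text, k=2) returns up to k sentences in document order. Replacing the weight inside build_matrix is the only change needed to try a different term weighting.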

Author information

Authors and Affiliations

Indian Institute of Information Technology and Management, Gwalior, Madhya Pradesh, 474010, India
Chandra Prakash & Anupam Shukla

Editor information

Editors and Affiliations

Dept. of Computer Sciences, University of Florida, 32611, Gainesville, FL, USA
Sanjay Ranka
University of Florida, Gainesville, FL, USA
Arunava Banerjee
Department of Computer Science and Engineering, Indian Institute of Technology, 110016, New Delhi, India
Kanad Kishore Biswas
Computer Science, College of Engineering and Science, Louisiana Tech University, Ruston, LA 71272, USA
Sumeet Dua
University of Florida, Gainesville, FL, USA
Prabhat Mishra
Department of Computer Science & Engineering, Indian Institute of Technology, 208016, Kanpur, India
Rajat Moona
National Tsing Hua University, Hsin-Chu, Taiwan, R.O.C.
Sheung-Hung Poon
Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong
Cho-Li Wang

About this paper

Cite this paper

Prakash, C., Shukla, A. (2010). Automatic Summary Generation from Single Document Using Information Gain. In: Ranka, S., et al. Contemporary Computing. IC3 2010. Communications in Computer and Information Science, vol 94. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14834-7_15

DOI: https://doi.org/10.1007/978-3-642-14834-7_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14833-0
Online ISBN: 978-3-642-14834-7
eBook Packages: Computer Science, Computer Science (R0)