Automatic Summary Generation from Single Document Using Information Gain

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 94)

Abstract

With vast amounts of information arriving over the Internet every day, it is not feasible to read each and every document. The information returned by a search engine is still far more than a user can handle and manage, so there is a need to present it in summarized form. In this paper, an automatic abstractive summarization technique for a single document is proposed. The sentences in the text are identified first. Then, from those sentence segments, the unique terms are identified. A Term-Sentence matrix is generated, in which the columns represent the sentences and the rows represent the terms. The entries in the matrix are weights derived from information gain. The column with the maximum cosine similarity is selected as the first sentence of the summary, and the remaining summary sentences are selected likewise. Results over documents indicate that the performance of the proposed approach compares very favorably with other approaches.
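The pipeline outlined in the abstract (sentence identification, unique terms, a Term-Sentence matrix, cosine-based selection) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and two points are assumptions, because the abstract does not define them: the weight used here is the binary entropy of a term's presence across sentences, standing in for the information-gain weight, and each column is compared against the sum of all columns. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    # Lowercase alphanumeric tokens; no stemming or stop-word removal.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def term_weight(doc_freq, n_sentences):
    # ASSUMPTION: binary entropy of "term occurs in a sentence" stands in
    # for the information-gain weight, which the abstract does not define.
    if doc_freq == 0 or doc_freq == n_sentences:
        return 0.0
    p = doc_freq / n_sentences
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def build_matrix(sentences):
    # Rows are terms, columns are sentences, as in the abstract.
    tokens = [tokenize(s) for s in sentences]
    terms = sorted({t for toks in tokens for t in toks})
    n = len(sentences)
    doc_freq = Counter(t for toks in tokens for t in set(toks))
    counts = [Counter(toks) for toks in tokens]
    matrix = [
        [counts[j][t] * term_weight(doc_freq[t], n) for j in range(n)]
        for t in terms
    ]
    return terms, matrix


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def summarize(text, k=2):
    sentences = split_sentences(text)
    _, matrix = build_matrix(sentences)
    n = len(sentences)
    columns = [[row[j] for row in matrix] for j in range(n)]
    # ASSUMPTION: similarity is measured against the sum of all columns.
    centroid = [sum(row) for row in matrix]
    ranked = sorted(
        range(n), key=lambda j: cosine(columns[j], centroid), reverse=True
    )
    # Return the k best-scoring sentences in their original order.
    return [sentences[j] for j in sorted(ranked[:k])]
```

Calling `summarize(text, k=3)` returns three sentences from `text` in document order. A different weighting scheme or a different comparison vector can be swapped in by changing `term_weight` or `centroid`.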




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Prakash, C., Shukla, A. (2010). Automatic Summary Generation from Single Document Using Information Gain. In: Ranka, S., et al. Contemporary Computing. IC3 2010. Communications in Computer and Information Science, vol 94. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14834-7_15


  • DOI: https://doi.org/10.1007/978-3-642-14834-7_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14833-0

  • Online ISBN: 978-3-642-14834-7

  • eBook Packages: Computer Science (R0)
