An improved method of automatic text summarization for web contents using lexical chain with semantic-related terms

  • Methodologies and Application
Soft Computing

Abstract

Research on automatic text summarization has grown steadily as the volume of text documents increases with the constant expansion of information diffusion. The objective of this work is to obtain, in an extractive manner, a brief summary that captures the most reliable, substantial, and relevant content of a text. Extractive text summarization produces a short summary of a text that retains its most important information by extracting a set of sentences from the original document. This paper proposes an improved extractive text summarization method that enhances the conventional lexical chain method, producing more relevant information from the text by using three distinct features of the keywords in a text. The keywords of a document are labeled using our previous work, the transition probability distribution generator model, which learns the characteristics of keywords in a document and generates their probability distribution over each feature.
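To make the extractive idea concrete, the following is a minimal sketch of sentence selection driven by keyword weights. It is not the paper's lexical chain method or its transition probability distribution generator. The function names, the naive sentence splitter, and the `keyword_weights` mapping are all assumptions made for illustration, with the weights standing in for whatever keyword scores a real model would supply.

```python
import re


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score_sentence(sentence, keyword_weights):
    """Sum the weights of the distinct keywords that occur in the sentence."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    return sum(weight for word, weight in keyword_weights.items() if word in tokens)


def extractive_summary(text, keyword_weights, n_sentences=2):
    """Return the n highest-scoring sentences, kept in their original order.

    keyword_weights is a hypothetical {keyword: weight} mapping; in the paper
    the keywords would come from the authors' own model, not from this sketch.
    """
    sentences = split_sentences(text)
    # Rank sentence indices by score, highest first (ties keep document order).
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: score_sentence(sentences[i], keyword_weights),
        reverse=True,
    )
    # Restore document order so the summary reads naturally.
    chosen = sorted(ranked[:n_sentences])
    return [sentences[i] for i in chosen]
```

Because the output is a subset of the original sentences, the summary is extractive by construction. A lexical chain approach would replace the flat keyword sum with scores derived from groups of semantically related terms.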

Acknowledgements

This study was supported by a research fund from Chosun University, 2015.

Author information

Corresponding author

Correspondence to Pankoo Kim.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Lynn, H.M., Choi, C. & Kim, P. An improved method of automatic text summarization for web contents using lexical chain with semantic-related terms. Soft Comput 22, 4013–4023 (2018). https://doi.org/10.1007/s00500-017-2612-9

  • DOI: https://doi.org/10.1007/s00500-017-2612-9
