Abstract
Automatic text summarization has attracted growing research attention as the volume of text documents keeps increasing with the continual expansion of information diffusion. The objective of this work is to obtain, in an extractive manner, a brief summary that captures the most reliable and relevant content of a text. Extractive text summarization produces a short summary of a given text that contains the most important information of the original by extracting a set of sentences from the original document. This paper proposes an improved extractive text summarization method for documents that enhances the conventional lexical chain method to produce more relevant information from the text, using three distinct features of keywords in a text. The keywords of the document are labeled using our previous work, the transition probability distribution generator model, which learns the characteristics of keywords in a document and generates their probability distribution for each feature.
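To make the extractive idea concrete, the following is a minimal sketch of sentence extraction driven by keyword weights. It is not the lexical chain method proposed in the paper: it substitutes a plain sum of keyword weights for chain scoring, and the function names, the naive sentence splitter, and the example weights are all assumptions introduced for illustration only.

```python
import re


def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace (assumption)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score_sentence(sentence, keyword_weights):
    """Sum the weights of the keywords that occur in the sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return sum(keyword_weights.get(tok, 0.0) for tok in tokens)


def extractive_summary(text, keyword_weights, k=2):
    """Return the k highest-scoring sentences, kept in document order."""
    sentences = split_sentences(text)
    # Rank sentence indices by score; ties keep their original order.
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: score_sentence(sentences[i], keyword_weights),
        reverse=True,
    )
    chosen = sorted(ranked[:k])  # restore document order
    return [sentences[i] for i in chosen]


# Hypothetical keyword weights, standing in for a keyword model's output.
weights = {"cats": 1.0, "mice": 0.8}
text = "Cats chase mice. The weather is nice today. Mice fear cats a lot. I like tea."
summary = extractive_summary(text, weights, k=2)
print(summary)
```

In a lexical chain approach, the per-sentence score would instead come from chains of semantically related terms rather than from isolated keyword matches; the selection step (rank sentences, keep the top few in document order) would remain similar.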
Acknowledgements
This study was supported by a research fund from Chosun University, 2015.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by V. Loia.
About this article
Cite this article
Lynn, H.M., Choi, C. & Kim, P. An improved method of automatic text summarization for web contents using lexical chain with semantic-related terms. Soft Comput 22, 4013–4023 (2018). https://doi.org/10.1007/s00500-017-2612-9
DOI: https://doi.org/10.1007/s00500-017-2612-9