Abstract
With the explosive growth of textual data on the internet, the demand for condensing it into short, readable, easy-to-understand summaries has increased, and much research aims to improve the efficiency of such summarization systems. In the past, extractive text summarization relied mainly on human-crafted features, which could not learn semantic information from the text. To improve summary quality, we design a completely data-driven, neural network-based model for extractive single-document summarization, which we term WL-AttenSumm. The model implements a word-level attention mechanism that focuses on the most important parts of the input sequence, capturing relevant semantic features at the word level that help select salient sentences for the summary. A further advantage is that it extracts syntactic and semantic relationships from the text through a Convolutional Bi-GRU (Bidirectional Gated Recurrent Unit) network. We train the model on the combined CNN/Daily Mail corpus and evaluate it on the Daily Mail, combined CNN/Daily Mail, and DUC 2002 test sets for single-document summarization, obtaining better results than state-of-the-art baseline approaches in terms of ROUGE metrics. With summary length limited to 75 words, our attention-based approach achieves ROUGE recall scores (R-1, R-2, R-L) of 32.8%, 11.0%, and 27.5% on the Daily Mail corpus and 55.9%, 24.8%, and 53.9% on the DUC 2002 dataset. Experiments on the combined CNN/Daily Mail dataset yield full-length ROUGE F1 scores of 42.9%, 19.7%, and 39.3%. Our deep learning-based summarization framework thus achieves competitive performance.
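The abstract does not give implementation details, so the following is a minimal PyTorch sketch of the high-level pipeline it describes: a 1-D convolution over word embeddings feeds a bidirectional GRU, and a word-level additive attention layer pools the hidden states into a sentence representation that is scored for extraction. All layer sizes, the kernel width, and the sigmoid scoring head are illustrative assumptions, not the authors' exact WL-AttenSumm configuration.

```python
# Hypothetical sketch of a word-level-attention Conv + Bi-GRU sentence
# scorer; dimensions and the scoring head are assumptions for illustration.
import torch
import torch.nn as nn

class WordLevelAttention(nn.Module):
    """Additive (Bahdanau-style) attention over word positions."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, hidden_dim)
        self.context = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, states):                 # states: (batch, words, hidden)
        scores = self.context(torch.tanh(self.proj(states)))  # (batch, words, 1)
        weights = torch.softmax(scores, dim=1)                # weights over words
        return (weights * states).sum(dim=1)   # attention-weighted sentence vector

class ConvBiGRUSentenceScorer(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, conv_channels=128, gru_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.conv = nn.Conv1d(emb_dim, conv_channels, kernel_size=3, padding=1)
        self.bigru = nn.GRU(conv_channels, gru_dim, batch_first=True,
                            bidirectional=True)
        self.attention = WordLevelAttention(2 * gru_dim)
        self.score = nn.Linear(2 * gru_dim, 1)  # salience score per sentence

    def forward(self, word_ids):                       # word_ids: (batch, words)
        x = self.embed(word_ids).transpose(1, 2)       # (batch, emb, words)
        x = torch.relu(self.conv(x)).transpose(1, 2)   # (batch, words, conv)
        states, _ = self.bigru(x)                      # (batch, words, 2*gru)
        sent_vec = self.attention(states)              # (batch, 2*gru)
        return torch.sigmoid(self.score(sent_vec)).squeeze(-1)

# Toy usage: score two 6-word "sentences"; in an extractive summarizer,
# the highest-scoring sentences would be selected for the summary.
model = ConvBiGRUSentenceScorer(vocab_size=10000)
print(model(torch.randint(1, 10000, (2, 6))))  # two salience scores in (0, 1)
```

The reported numbers are ROUGE recall (length-limited) and full-length F1 measures. The abstract does not state which evaluation toolkit was used; one common way to compute the same family of scores is Google's open-source rouge-score package, shown here purely for illustration:

```python
# Illustrative ROUGE-1/2/L computation with the rouge-score package
# (pip install rouge-score); toolkit choice is an assumption, not the paper's.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                  use_stemmer=True)
reference = "the cat sat on the mat"
candidate = "a cat was sitting on the mat"
for name, result in scorer.score(reference, candidate).items():
    print(name, f"recall={result.recall:.3f}", f"f1={result.fmeasure:.3f}")
```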
