
Deep learning-based extractive text summarization with word-level attention mechanism

Published in: Multimedia Tools and Applications

Abstract

With the rapid growth of textual data on the internet, the demand for summarizing it in a short, readable, easy-to-understand form has increased, and much research is being carried out to improve the efficiency of text summarization systems. In the past, extractive text summarization relied mainly on human-crafted features, which were unable to learn semantic information from the text. Therefore, in an attempt to improve summary quality, we have designed a neural network-based, fully data-driven model for extractive single-document summarization, which we term WL-AttenSumm. Our model implements a word-level attention mechanism that focuses on the important parts of the input sequence, so that relevant semantic features are captured at the word level, which helps in selecting significant sentences for the summary. Another advantage of this model is that it can extract syntactic and semantic relationships from the text using a convolutional Bi-GRU (Bidirectional Gated Recurrent Unit) network. We trained the model on the combined CNN/Daily Mail corpus and evaluated it on the Daily Mail, combined CNN/Daily Mail, and DUC 2002 test sets for single-document summarization, obtaining better results than state-of-the-art baseline approaches in terms of ROUGE metrics. For summaries limited to 75 words, our attention-based approach achieves ROUGE recall scores (R-1, R-2, R-L) of 32.8%, 11.0%, and 27.5% on the Daily Mail corpus and 55.9%, 24.8%, and 53.9% on the DUC 2002 dataset, respectively. Experiments on the joint CNN/Daily Mail dataset yield full-length ROUGE F1 scores of 42.9%, 19.7%, and 39.3%. Our deep learning-based summarization framework therefore achieves competitive performance.
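The word-level attention described in the abstract scores each word's encoder hidden state, normalizes the scores with a softmax, and forms a weighted combination that emphasizes the most informative words. The following is a minimal NumPy sketch of that generic mechanism, not the paper's exact implementation: the hidden states stand in for Bi-GRU outputs, and the scoring vector `w` is a hypothetical learned parameter.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def word_level_attention(H, w):
    """Score each word's hidden state, normalize with softmax, and
    return the attention weights plus the weighted sentence vector.

    H : (seq_len, hidden_dim) word representations (e.g. Bi-GRU outputs)
    w : (hidden_dim,) scoring vector (learned in a real model)
    """
    scores = H @ w           # one relevance score per word, shape (seq_len,)
    alpha = softmax(scores)  # attention weights, non-negative, sum to 1
    return alpha, alpha @ H  # sentence vector, shape (hidden_dim,)

# Toy usage: 6 words with 4-dimensional hidden states.
rng = np.random.default_rng(0)
H = rng.standard_normal((6, 4))
w = rng.standard_normal(4)
alpha, sent_vec = word_level_attention(H, w)
```

In a full model the sentence vectors produced this way would feed a sentence-level scorer that decides which sentences enter the extractive summary.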
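The ROUGE-N recall figures reported above measure the fraction of reference-summary n-grams that also appear in the system summary. A simple self-contained sketch of that computation (whitespace tokenization only; real evaluations use the ROUGE toolkit with stemming and stopword handling):

```python
from collections import Counter

def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: overlapping n-grams / total reference n-grams."""
    def ngrams(text, n):
        toks = text.split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))

    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    # Clipped overlap: each reference n-gram counts at most as often
    # as it appears in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0

r1 = rouge_n_recall("the cat sat on the mat", "the cat was on the mat")
print(round(r1, 4))  # → 0.8333 (5 of 6 reference unigrams matched)
```

ROUGE-L, also reported in the abstract, is based on the longest common subsequence rather than fixed-length n-grams, rewarding in-order matches.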



Author information


Corresponding author

Correspondence to Vishal Gupta.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Gambhir, M., Gupta, V. Deep learning-based extractive text summarization with word-level attention mechanism. Multimed Tools Appl 81, 20829–20852 (2022). https://doi.org/10.1007/s11042-022-12729-y
