Automatic text summarization using deep reinforced model coupling contextualized word representation and attention mechanism

  • Published in: Multimedia Tools and Applications

Abstract

With the rapid and unprecedented growth of textual data in recent years, there is a pressing need for automatic text summarization models that can retrieve useful information from large collections of documents without human intervention and within a reasonable time. Text summarization is commonly performed under either the extractive or the abstractive paradigm. Although various machine learning and deep learning methods have been proposed for text summarization in recent decades, they are still at an early stage of development and their potential has yet to be fully explored. Accordingly, this paper proposes a new summarization model that combines extractive and abstractive summarization in a single unified model based on the policy gradient method of reinforcement learning. The proposed model also employs a combination of a convolutional neural network and a gated recurrent unit in both the extraction and abstraction modules, together with an attention mechanism. Moreover, language models, namely Word2Vec and BERT, serve as the backbone of the proposed model to better express sentence semantics as word vectors. We conducted experiments on widely studied text summarization datasets (CNN/Daily Mail and DUC-2004); according to the empirical results, the proposed model not only achieved higher accuracy than both extractive and abstractive summarization models in terms of the ROUGE metric, but its generated summaries also exhibited higher saliency and readability based on human evaluation.
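For concreteness, the following is a minimal, self-contained PyTorch sketch of the kind of encoder block the abstract describes: convolutional filters over word embeddings feeding a bidirectional gated recurrent unit, with additive attention pooling the recurrent states into a sentence vector. This is not the authors' released code; the layer names and sizes are illustrative assumptions, and a trainable embedding table stands in for the Word2Vec/BERT backbone.

```python
# Illustrative sketch only (not the authors' implementation): a CNN + GRU
# encoder with additive attention, as outlined in the abstract. All sizes
# and names are hypothetical assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CNNGRUEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, conv_channels=64, hidden=128):
        super().__init__()
        # Trainable embeddings stand in for pretrained Word2Vec/BERT vectors.
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # 1-D convolution over the sequence captures local n-gram features.
        self.conv = nn.Conv1d(emb_dim, conv_channels, kernel_size=3, padding=1)
        # Bidirectional GRU captures longer-range, order-sensitive context.
        self.gru = nn.GRU(conv_channels, hidden, batch_first=True, bidirectional=True)
        # Additive (Bahdanau-style) attention over the GRU states.
        self.att_proj = nn.Linear(2 * hidden, hidden)
        self.att_score = nn.Linear(hidden, 1, bias=False)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len)
        emb = self.embed(token_ids)                       # (batch, seq_len, emb_dim)
        feats = F.relu(self.conv(emb.transpose(1, 2)))    # (batch, channels, seq_len)
        states, _ = self.gru(feats.transpose(1, 2))       # (batch, seq_len, 2*hidden)
        scores = self.att_score(torch.tanh(self.att_proj(states)))  # (batch, seq_len, 1)
        weights = torch.softmax(scores, dim=1)            # attention over positions
        context = (weights * states).sum(dim=1)           # weighted sentence vector
        return context, weights

if __name__ == "__main__":
    enc = CNNGRUEncoder(vocab_size=1000)
    ids = torch.randint(0, 1000, (2, 10))                 # two dummy 10-token inputs
    vec, att = enc(ids)
    print(vec.shape, att.shape)  # torch.Size([2, 256]) torch.Size([2, 10, 1])
```

In the unified setup the abstract outlines, extraction and abstraction modules built from components like this would be trained jointly under a policy-gradient reinforcement learning objective; that training loop is beyond the scope of this sketch.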

Notes

  1. https://pypi.org/project/pyrouge/ (a usage sketch follows these notes)

  2. https://github.com/summanlp/evaluation/tree/master/ROUGE-RELEASE-1.5.5

  3. https://github.com/abisee/pointer-generator

  4. https://github.com/nlpyang/PreSumm

  5. https://stanfordnlp.github.io/CoreNLP/

  6. https://github.com/huggingface/pytorch-transformers
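Notes 1 and 2 point to the tooling typically used to compute the ROUGE scores reported in evaluations like this paper's. Below is a minimal usage sketch of the pyrouge wrapper around ROUGE-1.5.5, assuming both are installed and configured; the directories and filename patterns are hypothetical placeholders.

```python
# Hedged sketch: score system summaries against references with pyrouge (note 1),
# which drives the Perl ROUGE-1.5.5 toolkit (note 2). Paths are hypothetical.
from pyrouge import Rouge155

r = Rouge155()
r.system_dir = "outputs/system_summaries"     # model-generated summaries, one file per document
r.model_dir = "outputs/reference_summaries"   # gold reference summaries
# System and reference files are paired via the ID captured by the regex group.
r.system_filename_pattern = r"summary.(\d+).txt"
r.model_filename_pattern = "summary.#ID#.txt"

output = r.convert_and_evaluate()   # runs the Perl ROUGE script and returns its report
scores = r.output_to_dict(output)   # e.g. scores["rouge_1_f_score"]
print(scores["rouge_1_f_score"], scores["rouge_2_f_score"], scores["rouge_l_f_score"])
```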


Author information

Corresponding author

Correspondence to Mohammad Taghi Manzuri.

Ethics declarations

Conflict of interest

Authors Hassan Aliakbarpour, Mohammad Taghi Manzuri, and Amir Masoud Rahmani declare that they have no conflict of interest.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Aliakbarpour, H., Manzuri, M.T. & Rahmani, A.M. Automatic text summarization using deep reinforced model coupling contextualized word representation and attention mechanism. Multimed Tools Appl 83, 733–762 (2024). https://doi.org/10.1007/s11042-023-15589-2

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11042-023-15589-2

Keywords

Navigation