Automatic text summarization using deep reinforced model coupling contextualized word representation and attention mechanism

  • Published in: Multimedia Tools and Applications

Abstract

With the rapid and unprecedented growth of textual data in recent years, there is a pressing need for automatic text summarization models that can retrieve useful information from large collections of documents without human intervention and within a reasonable time. Text summarization is commonly performed under either the extractive or the abstractive paradigm. Although various machine learning and deep learning methods have been proposed for text summarization in recent decades, they are still at an early stage of development and their potential has yet to be fully explored. Accordingly, this paper proposes a new summarization model that combines extractive and abstractive summarization in a single unified model based on the policy gradient method of reinforcement learning. The proposed model also employs a combination of a convolutional neural network and a gated recurrent unit in both the extraction and abstraction modules, together with an attention mechanism. Moreover, language models, namely Word2Vec and BERT, serve as the backbone of the proposed model to better express sentence semantics as word vectors. We conducted experiments on widely studied text summarization datasets (CNN/Daily Mail and DUC-2004); according to the empirical results, the proposed model not only achieved higher accuracy than both extractive and abstractive summarization models in terms of the ROUGE metric, but its generated summaries also exhibited higher saliency and readability based on human evaluation.
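For concreteness, the following is a minimal, self-contained PyTorch sketch of the kind of encoder block the abstract describes: convolutional filters over word embeddings feeding a bidirectional gated recurrent unit, with additive attention pooling the recurrent states into a sentence vector. This is not the authors' released code; the layer names and sizes are illustrative assumptions, and a trainable embedding table stands in for the Word2Vec/BERT backbone.

```python
# Illustrative sketch only (not the authors' implementation): a CNN + GRU
# encoder with additive attention, as outlined in the abstract. All sizes
# and names are hypothetical assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CNNGRUEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, conv_channels=64, hidden=128):
        super().__init__()
        # Trainable embeddings stand in for pretrained Word2Vec/BERT vectors.
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # 1-D convolution over the sequence captures local n-gram features.
        self.conv = nn.Conv1d(emb_dim, conv_channels, kernel_size=3, padding=1)
        # Bidirectional GRU captures longer-range, order-sensitive context.
        self.gru = nn.GRU(conv_channels, hidden, batch_first=True, bidirectional=True)
        # Additive (Bahdanau-style) attention over the GRU states.
        self.att_proj = nn.Linear(2 * hidden, hidden)
        self.att_score = nn.Linear(hidden, 1, bias=False)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len)
        emb = self.embed(token_ids)                       # (batch, seq_len, emb_dim)
        feats = F.relu(self.conv(emb.transpose(1, 2)))    # (batch, channels, seq_len)
        states, _ = self.gru(feats.transpose(1, 2))       # (batch, seq_len, 2*hidden)
        scores = self.att_score(torch.tanh(self.att_proj(states)))  # (batch, seq_len, 1)
        weights = torch.softmax(scores, dim=1)            # attention over positions
        context = (weights * states).sum(dim=1)           # weighted sentence vector
        return context, weights

if __name__ == "__main__":
    enc = CNNGRUEncoder(vocab_size=1000)
    ids = torch.randint(0, 1000, (2, 10))                 # two dummy 10-token inputs
    vec, att = enc(ids)
    print(vec.shape, att.shape)  # torch.Size([2, 256]) torch.Size([2, 10, 1])
```

In the unified setup the abstract outlines, extraction and abstraction modules built from components like this would be trained jointly under a policy-gradient reinforcement learning objective; that training loop is beyond the scope of this sketch.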

Notes

  1. https://pypi.org/project/pyrouge/ (a usage sketch follows these notes)

  2. https://github.com/summanlp/evaluation/tree/master/ROUGE-RELEASE-1.5.5

  3. https://github.com/abisee/pointer-generator

  4. https://github.com/nlpyang/PreSumm

  5. https://stanfordnlp.github.io/CoreNLP/

  6. https://github.com/huggingface/pytorch-transformers
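Notes 1 and 2 point to the tooling typically used to compute the ROUGE scores reported in evaluations like this paper's. Below is a minimal usage sketch of the pyrouge wrapper around ROUGE-1.5.5, assuming both are installed and configured; the directories and filename patterns are hypothetical placeholders.

```python
# Hedged sketch: score system summaries against references with pyrouge (note 1),
# which drives the Perl ROUGE-1.5.5 toolkit (note 2). Paths are hypothetical.
from pyrouge import Rouge155

r = Rouge155()
r.system_dir = "outputs/system_summaries"     # model-generated summaries, one file per document
r.model_dir = "outputs/reference_summaries"   # gold reference summaries
# System and reference files are paired via the ID captured by the regex group.
r.system_filename_pattern = r"summary.(\d+).txt"
r.model_filename_pattern = "summary.#ID#.txt"

output = r.convert_and_evaluate()   # runs the Perl ROUGE script and returns its report
scores = r.output_to_dict(output)   # e.g. scores["rouge_1_f_score"]
print(scores["rouge_1_f_score"], scores["rouge_2_f_score"], scores["rouge_l_f_score"])
```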


Author information

Corresponding author

Correspondence to Mohammad Taghi Manzuri.

Ethics declarations

Conflict of interest

Authors Hassan Aliakbarpour, Mohammad Taghi Manzuri, and Amir Masoud Rahmani declare that they have no conflict of interest.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Aliakbarpour, H., Manzuri, M.T. & Rahmani, A.M. Automatic text summarization using deep reinforced model coupling contextualized word representation and attention mechanism. Multimed Tools Appl 83, 733–762 (2024). https://doi.org/10.1007/s11042-023-15589-2

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11042-023-15589-2

Keywords

Navigation