Abstract
With the rapid and unprecedented growth of textual data in recent years, automatic text summarization models are needed to retrieve useful information from large collections of documents without human intervention and within a reasonable time. Text summarization is commonly performed under either the extractive or the abstractive paradigm. Although various machine learning and deep learning methods have been proposed for text summarization over the last decades, they are still in the early stages of development and their potential has yet to be fully explored. Accordingly, this paper proposes a new summarization model that combines extractive and abstractive summarization into a single unified model trained with the policy gradient method of reinforcement learning. The proposed model also employs a combination of convolutional neural networks and gated recurrent units, equipped with an attention mechanism, in both its extraction and abstraction modules. Moreover, language models, namely Word2Vec and BERT, serve as the backbone of the proposed model to better capture sentence semantics as word vectors. We conducted experiments on widely studied text summarization datasets (CNN/Daily Mail and DUC-2004). According to the empirical results, the proposed model not only achieves higher accuracy than both extractive and abstractive summarization baselines in terms of the ROUGE metric, but its generated summaries also exhibit higher saliency and readability in human evaluation.
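The abstract evaluates summaries with the ROUGE metric, which scores a generated summary by its n-gram overlap with a reference summary; in policy-gradient training, such a score is typically used as the reward signal. As a minimal illustration (not the authors' implementation, which uses the full ROUGE package), the following sketch computes ROUGE-N precision, recall, and F1 from clipped n-gram counts:

```python
from collections import Counter

def rouge_n(reference, candidate, n=1):
    """ROUGE-N precision, recall, and F1 between two token lists."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    ref, cand = ngrams(reference), ngrams(candidate)
    # Counter intersection clips each n-gram's count, as ROUGE requires.
    overlap = sum((ref & cand).values())
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1

p, r, f = rouge_n("the cat sat on the mat".split(),
                  "the cat lay on the mat".split())
```

Here five of six unigrams match (counting "the" twice), so precision and recall are both 5/6 and F1 is about 0.833.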
Ethics declarations
Conflict of interest
Hassan Aliakbarpour, Mohammad Taghi Manzuri, and Amir Masoud Rahmani declare that they have no conflict of interest.
About this article
Cite this article
Aliakbarpour, H., Manzuri, M.T. & Rahmani, A.M. Automatic text summarization using deep reinforced model coupling contextualized word representation and attention mechanism. Multimed Tools Appl 83, 733–762 (2024). https://doi.org/10.1007/s11042-023-15589-2