Abstract
Most sequence-to-sequence abstractive summarization models generate a summary conditioned on the source article and the words generated so far, but they neglect the future information implied by the words yet to be generated; that is, they lack the ability to "look ahead." In this paper, we present a novel summarization model with "lookahead" ability that fully exploits this implied future information. Our model takes two steps: (1) in the training step, we train an asynchronous decoder model in which a backward decoder, run without ground-truth guidance, explicitly produces the future information that the forward decoder then exploits; (2) in the inference step, we propose an enriched-information decoding method that scores candidates not only by the joint probability of the sequence generated so far but also by the future ROUGE reward of the words yet to be generated. This future ROUGE reward is predicted by a novel reward-prediction model that takes the hidden states of the pre-trained asynchronous decoder model as input. Experimental results show that our two-step summarization model achieves new state-of-the-art results on the CNN/Daily Mail dataset, and that it generalizes well, scoring higher than the state-of-the-art model on the test-only DUC-2002 dataset.
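To make the decoding objective concrete, the following is a minimal, hypothetical Python sketch of the enriched-information scoring idea: each beam candidate is ranked by its joint log-probability plus a weighted future ROUGE reward predicted from the decoder's hidden state. All names here (`Candidate`, `predict_future_reward`, `lambda_weight`) and the length normalization are illustrative assumptions for exposition, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Any, List

@dataclass
class Candidate:
    tokens: List[int]         # word ids generated so far
    log_prob: float           # joint log-probability of those words
    hidden_state: Any = None  # decoder hidden state fed to the reward predictor

def predict_future_reward(hidden_state: Any) -> float:
    """Stand-in for the reward-prediction model described in the abstract,
    which maps the pre-trained asynchronous decoder's hidden states to an
    estimate of the ROUGE reward attainable by the yet-to-be-generated words.
    A trained regressor would replace this placeholder constant."""
    return 0.0

def enriched_score(cand: Candidate, lambda_weight: float = 0.5) -> float:
    # Joint log-probability of the generated prefix (length-normalized here,
    # an assumption) plus the weighted predicted future reward of the
    # un-generated suffix.
    lm_score = cand.log_prob / max(len(cand.tokens), 1)
    return lm_score + lambda_weight * predict_future_reward(cand.hidden_state)

def rerank(beam: List[Candidate]) -> List[Candidate]:
    # Order the beam by the enriched score instead of log-probability alone.
    return sorted(beam, key=enriched_score, reverse=True)

if __name__ == "__main__":
    beam = [Candidate([3, 14, 15], -4.2), Candidate([3, 14, 9], -3.8)]
    print([c.tokens for c in rerank(beam)])
```

Under these assumptions, the design choice is simply to augment the standard beam-search score with a learned estimate of future reward, so candidates that look locally probable but lead to low-ROUGE continuations can be demoted during search.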





Ethics declarations
Conflict of interest
We declare that we have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Li, S., Xu, J. A two-step abstractive summarization model with asynchronous and enriched-information decoding. Neural Comput & Applic 33, 1159–1170 (2021). https://doi.org/10.1007/s00521-020-05005-3