Abstract
Most sequence-to-sequence abstractive summarization models generate a summary conditioned on the source article and the words generated so far, but they neglect the future information implied by the words yet to be generated; that is, they lack the ability to "look ahead." In this paper, we present a novel summarization model with "lookahead" ability that fully exploits this implied future information. Our model takes two steps: (1) in the training step, we train an asynchronous decoder model in which a backward decoder, run without ground-truth guidance, explicitly produces the future information that the forward decoder then exploits; (2) in the inference step, we propose an enriched-information decoding method that scores candidates not only by the joint probability of the sequence generated so far but also by the future ROUGE reward of the words yet to be generated. This future ROUGE reward is predicted by a novel reward-prediction model that takes the hidden states of the pre-trained asynchronous decoder model as input. Experimental results show that our two-step summarization model achieves new state-of-the-art results on the CNN/Daily Mail dataset, and that it generalizes well, scoring higher than the state-of-the-art model on the test-only DUC-2002 dataset.
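To make the decoding objective concrete, the following is a minimal, hypothetical Python sketch of the enriched-information scoring idea: each beam candidate is ranked by its joint log-probability plus a weighted future ROUGE reward predicted from the decoder's hidden state. All names here (`Candidate`, `predict_future_reward`, `lambda_weight`) and the length normalization are illustrative assumptions for exposition, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Any, List

@dataclass
class Candidate:
    tokens: List[int]         # word ids generated so far
    log_prob: float           # joint log-probability of those words
    hidden_state: Any = None  # decoder hidden state fed to the reward predictor

def predict_future_reward(hidden_state: Any) -> float:
    """Stand-in for the reward-prediction model described in the abstract,
    which maps the pre-trained asynchronous decoder's hidden states to an
    estimate of the ROUGE reward attainable by the yet-to-be-generated words.
    A trained regressor would replace this placeholder constant."""
    return 0.0

def enriched_score(cand: Candidate, lambda_weight: float = 0.5) -> float:
    # Joint log-probability of the generated prefix (length-normalized here,
    # an assumption) plus the weighted predicted future reward of the
    # un-generated suffix.
    lm_score = cand.log_prob / max(len(cand.tokens), 1)
    return lm_score + lambda_weight * predict_future_reward(cand.hidden_state)

def rerank(beam: List[Candidate]) -> List[Candidate]:
    # Order the beam by the enriched score instead of log-probability alone.
    return sorted(beam, key=enriched_score, reverse=True)

if __name__ == "__main__":
    beam = [Candidate([3, 14, 15], -4.2), Candidate([3, 14, 9], -3.8)]
    print([c.tokens for c in rerank(beam)])
```

Under these assumptions, the design choice is simply to augment the standard beam-search score with a learned estimate of future reward, so candidates that look locally probable but lead to low-ROUGE continuations can be demoted during search.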





Ethics declarations
Conflict of interest
We declare that we have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Li, S., Xu, J. A two-step abstractive summarization model with asynchronous and enriched-information decoding. Neural Comput & Applic 33, 1159–1170 (2021). https://doi.org/10.1007/s00521-020-05005-3