A two-step abstractive summarization model with asynchronous and enriched-information decoding

  • Original Article
  • Neural Computing and Applications

Abstract

Most sequence-to-sequence abstractive summarization models generate a summary conditioned on the source article and the words generated so far, but they neglect the future information implied in the words yet to be generated; that is, they lack the ability of "lookahead." In this paper, we present a novel summarization model with "lookahead" ability that fully exploits this implied future information. Our model works in two steps. (1) In the training step, an asynchronous decoder model is trained whose backward decoder, which is not guided by the ground truth, explicitly produces and exploits the future information. (2) In the inference step, we propose an enriched-information decoding method that scores candidates not only by the joint probability of the generated sequence but also by the predicted future ROUGE reward of the words yet to be generated. This future reward is predicted by a novel reward-prediction model that takes the hidden states of the pre-trained asynchronous decoder model as input. Experimental results show that our two-step summarization model achieves new state-of-the-art results on the CNN/Daily Mail dataset and generalizes well, scoring higher than the previous state-of-the-art model on the test-only DUC-2002 dataset.
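
This page does not reproduce the model's equations, so the following PyTorch sketch is only a hedged illustration of the enriched-information decoding idea summarized above: a candidate prefix is scored by its joint log-probability plus a weighted estimate of the future ROUGE reward of the words not yet generated, where the estimate comes from a small network reading the decoder's hidden state. The `RewardPredictor` module, its MLP architecture, and the interpolation weight `lam` are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn


class RewardPredictor(nn.Module):
    """Hypothetical reward-prediction head: maps a decoder hidden state
    to a scalar estimate of the future ROUGE reward of the words that
    have not been generated yet."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.Tanh(),
            nn.Linear(hidden_size, 1),
        )

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, hidden_size) -> (batch,)
        return self.mlp(hidden).squeeze(-1)


def enriched_score(log_probs: torch.Tensor,
                   hidden: torch.Tensor,
                   predictor: RewardPredictor,
                   lam: float = 0.5) -> torch.Tensor:
    """Score one beam candidate: the joint log-probability of the
    generated prefix plus a weighted future-reward estimate.
    `lam` is an assumed interpolation weight, not a value from the paper."""
    joint_log_prob = log_probs.sum(dim=-1)  # (batch,)
    future_reward = predictor(hidden)       # (batch,)
    return joint_log_prob + lam * future_reward


if __name__ == "__main__":
    # Toy usage: random tensors stand in for real decoder output.
    batch, steps, hidden_size = 4, 7, 256
    log_probs = torch.randn(batch, steps).clamp(max=0.0)  # per-token log P
    hidden = torch.randn(batch, hidden_size)              # last decoder state
    predictor = RewardPredictor(hidden_size)
    print(enriched_score(log_probs, hidden, predictor))
```

In a full beam-search decoder, a score of this form would replace the plain log-probability when ranking beam candidates, which is how the "lookahead" signal influences which prefixes survive.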

Notes

  1. https://www-nlpir.nist.gov/projects/duc/guidelines/2002.html.

  2. https://pypi.org/project/pyrouge/0.1.3.
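
For reference, a minimal sketch of how ROUGE scores are typically computed with the pyrouge package linked in note 2. This assumes the underlying ROUGE-1.5.5 Perl toolkit is installed and configured; the directory paths and filename patterns below are placeholders, not values from the paper.

```python
from pyrouge import Rouge155  # wrapper around the ROUGE-1.5.5 Perl script

r = Rouge155()
# Placeholder paths: one system summary and one or more reference
# summaries per document, matched by the ID captured in the patterns.
r.system_dir = "path/to/system_summaries"
r.model_dir = "path/to/reference_summaries"
r.system_filename_pattern = r"(\d+)_system.txt"
r.model_filename_pattern = "#ID#_reference.txt"

output = r.convert_and_evaluate()  # runs ROUGE and returns its text report
scores = r.output_to_dict(output)  # e.g. scores["rouge_1_f_score"]
print(scores["rouge_1_f_score"], scores["rouge_2_f_score"], scores["rouge_l_f_score"])
```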


Author information

Corresponding author

Correspondence to Jungang Xu.

Ethics declarations

Conflict of interest

We declare that we have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, S., Xu, J. A two-step abstractive summarization model with asynchronous and enriched-information decoding. Neural Comput & Applic 33, 1159–1170 (2021). https://doi.org/10.1007/s00521-020-05005-3
