Abstractive document summarization via multi-template decoding

Huang, Yuxin; Yu, Zhengtao; Guo, Junjun; Xiang, Yan; Yu, Zhiqiang; Xian, Yantuan

doi:10.1007/s10489-021-02607-9

Published: 08 January 2022

Applied Intelligence, Volume 52, pages 9650–9663 (2022)

Abstract

Most previous abstractive summarization models generate the summary left to right and therefore make little use of target-side global information. Recent work seeks to alleviate this issue by retrieving target-side templates from a large-scale training corpus, but these approaches are limited by template quality. To overcome the resulting template selection bias, one promising direction is to draw better target-side global information from multiple high-quality templates. This paper therefore extends the encoder-decoder framework with a multi-template decoding mechanism that uses multiple templates retrieved from the training corpus according to semantic distance. In addition, we introduce a multi-granular attention mechanism that simultaneously accounts for the importance of words within each template and the importance of the different templates. Extensive experimental results on CNN/Daily Mail and English Gigaword show that the proposed model significantly outperforms several state-of-the-art abstractive and extractive baseline models.
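
The two ideas named in the abstract, retrieving several templates by semantic distance and attending over them at two granularities, can be illustrated with a minimal sketch. This is not the authors' formulation, which the abstract does not specify: the cosine-similarity retrieval, the dot-product scoring, and the function names `retrieve_templates` and `multi_granular_attention` are all assumptions made for illustration, and vectors are plain Python lists standing in for learned representations.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0


def retrieve_templates(doc_vec, corpus, k):
    """Return the summaries of the k training documents closest to doc_vec.

    corpus is a list of (document_vector, summary) pairs. Cosine similarity
    is an assumed stand-in for the paper's semantic distance.
    """
    ranked = sorted(corpus, key=lambda pair: cosine(doc_vec, pair[0]), reverse=True)
    return [summary for _, summary in ranked[:k]]


def multi_granular_attention(query, templates):
    """Two-level attention over templates (hypothetical dot-product scoring).

    templates is a list of templates, each a list of word vectors.
    Returns the fused context vector and the template-level weights.
    """
    # Word level: summarize each template by attending over its words.
    template_vecs = []
    for words in templates:
        word_weights = softmax([dot(query, w) for w in words])
        template_vecs.append(weighted_sum(word_weights, words))
    # Template level: weight the template summaries against the same query.
    template_weights = softmax([dot(query, t) for t in template_vecs])
    return weighted_sum(template_weights, template_vecs), template_weights


if __name__ == "__main__":
    query = [1.0, 0.0]  # stands in for a decoder state
    templates = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0]]]
    context, weights = multi_granular_attention(query, templates)
    print(context, weights)
```

In this sketch a template whose words align with the decoder query receives a larger template-level weight, so no single retrieved template dominates the fused context unless it is clearly the most relevant one.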

Acknowledgements

We would like to thank the anonymous reviewers for their constructive comments. This work was supported by the National Key Research and Development Program of China (Grant Nos. 2018YFC0830105, 2018YFC0830100); the National Natural Science Foundation of China (Grant Nos. 61972186, 61762056, 61472168); the Yunnan Provincial Major Science and Technology Special Plan Projects (Grant No. 202002AD080001); and the General Projects of Basic Research in Yunnan Province (Grant Nos. 202001AT070047, 202001AT070046).

Author information

Authors and Affiliations

Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500, China
Yuxin Huang, Zhengtao Yu, Junjun Guo, Yan Xiang, Zhiqiang Yu & Yantuan Xian
Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming, 650500, China
Yuxin Huang, Zhengtao Yu, Junjun Guo, Yan Xiang, Zhiqiang Yu & Yantuan Xian

Corresponding author

Correspondence to Zhengtao Yu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Huang, Y., Yu, Z., Guo, J. et al. Abstractive document summarization via multi-template decoding. Appl Intell 52, 9650–9663 (2022). https://doi.org/10.1007/s10489-021-02607-9

Accepted: 09 June 2021
Published: 08 January 2022
Issue Date: July 2022
DOI: https://doi.org/10.1007/s10489-021-02607-9
