Abstract
In the era of big data, machine summarization models provide an efficient way to process massive amounts of text rapidly. Whether the factual statements in a generated summary are consistent with the input text is a critical criterion in real-world applications. However, most existing approaches based on standard likelihood training ignore this problem and focus only on improving ROUGE scores. In this paper, we propose a two-stage Transformer-based abstractive summarization model, denoted FCSF-TABS, that improves factual correctness. In the first stage, a fine-tuned BERT classifier performs content selection, identifying summary-worthy single sentences or adjacent sentence pairs in the input document. In the second stage, the selected sentences are fed into a Transformer-based summarization model to generate the summary sentences. Furthermore, during training we introduce reinforcement learning to jointly optimize a mixed-objective loss function. Specifically, to train our model, we carefully constructed two training sets that jointly account for informativeness and factual consistency. We conduct extensive experiments on the CNN/DailyMail and XSum datasets. The results show that, compared with several popular summarization models, FCSF-TABS not only improves ROUGE scores but also produces summaries with fewer factual errors.
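To make the training objective concrete, the following is a minimal sketch of a mixed maximum-likelihood and reinforcement-learning loss of the kind the abstract refers to, in the standard self-critical formulation common in the summarization literature; the weighting coefficient gamma and the choice of ROUGE as the reward are illustrative assumptions, and the exact formulation used by FCSF-TABS is given in the paper body.

\mathcal{L}_{ML} = -\sum_{t=1}^{T} \log p\left(y^{*}_{t} \mid y^{*}_{<t}, x\right)

\mathcal{L}_{RL} = \left(r(\hat{y}) - r(y^{s})\right) \sum_{t=1}^{T} \log p\left(y^{s}_{t} \mid y^{s}_{<t}, x\right)

\mathcal{L}_{mixed} = \gamma\, \mathcal{L}_{RL} + (1-\gamma)\, \mathcal{L}_{ML}

Here x is the input, y* the reference summary, y^s a summary sampled from the model, \hat{y} the greedily decoded baseline, and r(·) a sequence-level reward such as ROUGE. Minimizing L_RL increases the likelihood of sampled summaries that score higher than the greedy baseline, while the L_ML term keeps the output fluent and close to the reference.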





Acknowledgements
This work was supported by the Guangxi Science and Technology Foundation (2019GXNSFGA245004, 2018GXNSFAA138116) and the National Natural Science Foundation of China (61862011).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Cite this article
Zhang, M., Zhou, G., Yu, W. et al. FCSF-TABS: two-stage abstractive summarization with fact-aware reinforced content selection and fusion. Neural Comput & Applic 34, 10547–10560 (2022). https://doi.org/10.1007/s00521-021-06880-0