Abstract
In the era of big data, machine summarization models provide an efficient way to process massive amounts of text rapidly. Whether the factual statements in a generated summary are consistent with the input text is a critical criterion in real-world applications. However, most existing approaches based on standard likelihood training ignore this problem and focus only on improving ROUGE scores. In this paper, we propose a two-stage Transformer-based abstractive summarization model, denoted FCSF-TABS, that improves factual correctness. In the first stage, a fine-tuned BERT classifier performs content selection, identifying summary-worthy single sentences or adjacent sentence pairs in the input document. In the second stage, the selected sentences are fed into a Transformer-based summarization model to generate the summary sentences. Furthermore, during training we introduce reinforcement learning to jointly optimize a mixed-objective loss function. Specifically, to train our model, we carefully constructed two training sets that jointly account for informativeness and factual consistency. We conduct extensive experiments on the CNN/DailyMail and XSum datasets. The results show that, compared with several popular summarization models, FCSF-TABS not only improves ROUGE scores but also produces summaries with fewer factual errors.
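To make the training objective concrete, the following is a minimal sketch of a mixed maximum-likelihood and reinforcement-learning loss of the kind the abstract refers to, in the standard self-critical formulation common in the summarization literature; the weighting coefficient gamma and the choice of ROUGE as the reward are illustrative assumptions, and the exact formulation used by FCSF-TABS is given in the paper body.

\mathcal{L}_{ML} = -\sum_{t=1}^{T} \log p\left(y^{*}_{t} \mid y^{*}_{<t}, x\right)

\mathcal{L}_{RL} = \left(r(\hat{y}) - r(y^{s})\right) \sum_{t=1}^{T} \log p\left(y^{s}_{t} \mid y^{s}_{<t}, x\right)

\mathcal{L}_{mixed} = \gamma\, \mathcal{L}_{RL} + (1-\gamma)\, \mathcal{L}_{ML}

Here x is the input, y* the reference summary, y^s a summary sampled from the model, \hat{y} the greedily decoded baseline, and r(·) a sequence-level reward such as ROUGE. Minimizing L_RL increases the likelihood of sampled summaries that score higher than the greedy baseline, while the L_ML term keeps the output fluent and close to the reference.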





Acknowledgements
This work was supported by the Guangxi Science and Technology Foundation (2019GXNSFGA245004, 2018GXNSFAA138116) and the National Natural Science Foundation of China (61862011).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Cite this article
Zhang, M., Zhou, G., Yu, W. et al. FCSF-TABS: two-stage abstractive summarization with fact-aware reinforced content selection and fusion. Neural Comput & Applic 34, 10547–10560 (2022). https://doi.org/10.1007/s00521-021-06880-0