FCSF-TABS: two-stage abstractive summarization with fact-aware reinforced content selection and fusion

  • Special Issue: Effective and Efficient Deep Learning
  • Journal: Neural Computing and Applications

Abstract

In the era of big data, machine summarization models provide a new and efficient way to process massive amounts of text rapidly. In real-world tasks, whether the factual descriptions in a generated summary are consistent with the input text is a critical quality criterion. However, most existing approaches based on standard likelihood training ignore this problem and focus only on improving ROUGE scores. In this paper, we propose a two-stage Transformer-based abstractive summarization model, denoted FCSF-TABS, to improve factual correctness. In the first stage, a fine-tuned BERT classifier performs content selection, choosing summary-worthy single sentences or adjacent sentence pairs from the input document. In the second stage, the selected sentences are fed into a Transformer-based summarization model to generate summary sentences. During training, we further introduce reinforcement learning to jointly optimize a mixed-objective loss function. Specifically, to train our model we carefully construct two training sets that jointly account for informativeness and factual consistency. We conduct extensive experiments on the CNN/DailyMail and XSum datasets. The results show that, compared with several popular summarization models, FCSF-TABS not only improves ROUGE scores but also produces summaries with fewer factual errors.
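The mixed-objective training mentioned in the abstract is not detailed on this page. The sketch below shows, in PyTorch, how a maximum-likelihood loss is commonly combined with a self-critical reinforcement-learning term driven by a ROUGE-based reward (in the style of deep reinforced summarization models such as Paulus et al., 2018). It is a minimal illustration under assumed interfaces, not the authors' implementation: `model.sample`, `model.greedy_decode`, `model.pad_id`, `rouge_reward`, and `gamma` are all illustrative assumptions.

```python
# Hedged sketch (not the authors' code): mixed maximum-likelihood +
# self-critical RL objective for abstractive summarization.
import torch
import torch.nn.functional as F

def mixed_objective_loss(model, src, tgt, rouge_reward, gamma=0.98):
    """L = gamma * L_rl + (1 - gamma) * L_ml, with a self-critical baseline.

    Assumptions (illustrative, not from the paper):
      - model(src, tgt_in) returns logits of shape (batch, len, vocab)
      - model.sample(src) returns sampled ids and their token log-probs
      - model.greedy_decode(src) returns greedily decoded ids
      - rouge_reward(ids, tgt) returns a per-example reward tensor (batch,)
    """
    # Maximum-likelihood term: teacher-forced cross-entropy.
    logits = model(src, tgt[:, :-1])
    ml_loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        tgt[:, 1:].reshape(-1),
        ignore_index=model.pad_id,
    )

    # RL term: sampled sequence scored against a greedy (self-critical) baseline.
    sample_ids, sample_logprobs = model.sample(src)      # stochastic decode
    with torch.no_grad():
        greedy_ids = model.greedy_decode(src)            # baseline decode
    advantage = (rouge_reward(sample_ids, tgt)
                 - rouge_reward(greedy_ids, tgt)).detach()
    rl_loss = -(advantage * sample_logprobs.sum(dim=1)).mean()

    return gamma * rl_loss + (1.0 - gamma) * ml_loss
```

In this formulation, sampled summaries that beat the greedy baseline on the ROUGE-based reward receive a positive advantage and their log-probabilities are increased, while the maximum-likelihood term keeps the generator anchored to fluent reference-like output; the weighting coefficient `gamma` is a tunable hyperparameter here, not a value reported on this page.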

References

  1. Li W, Zhuge H (2019) Abstractive multi-document summarization based on semantic link network. IEEE Trans Knowl Data Eng 33(1):43–54

    Article  Google Scholar 

  2. Zhang M, Zhou G, Yu W et al (2020) FAR-ASS: fact-aware reinforced abstractive sentence summarization. Inf Process Manag. https://doi.org/10.1016/j.ipm.2020.102478

    Article  Google Scholar 

  3. Mehta P, Majumder P (2018) Effective aggregation of various summarization techniques. Inf Process Manag 54(2):145–158

    Article  Google Scholar 

  4. Gao Y, Xu Y, Huang H et al (2019) Jointly learning topics in sentence embedding for document summarization. IEEE Trans Knowl Data Eng 32(4):688–699

    Article  Google Scholar 

  5. Mohamed M, Oussalah M (2019) SRL-ESA-TextSum: a text summarization approach based on semantic role labeling and explicit semantic analysis. Inf Process Manag 56(4):1356–1372

    Article  Google Scholar 

  6. Yulianti E, Chen RC, Scholer F et al (2018) Document summarization for answering non-factoid queries. IEEE Trans Knowl Data Eng 30(1):15–28

    Article  Google Scholar 

  7. Zhu J, Wang Q, Wang Y, Zhou Y et al. (2019) NCLS: neural cross-lingual summarization. Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing, 3054–3064

  8. Zheng H, Lapata M (2019) Sentence centrality revisited for unsupervised summarization. Proceedings of the 57th annual meeting of the association for computational linguistics, 6236–6247

  9. Barros C, Lloret E, Saquete E et al (2019) NATSUM: narrative abstractive summarization through cross-document timeline generation. Inf Process Manag 56(5):1775–1793

    Article  Google Scholar 

  10. Zhang X, Lapata M, Wei F et al. (2018) Neural latent extractive document summarization. Proceedings of the 2018 conference on empirical methods in natural language processing, 779–784

  11. Jadhav A, Rajan V (2018) Extractive summarization with swap-net: sentences and words from alternating pointer networks. Proceedings of the 56th annual meeting of the association for computational linguistics, 142–151

  12. Dong Y, Shen Y, Crawford E et al. (2018) Banditsum: extractive summarization as a contextual bandit. Proceedings of the 2018 conference on empirical methods in natural language processing, 3739–3748

  13. Zhang X, Lapata M, Wei F et al. (2018) Neural latent extractive document summarization. Proceedings of 2018 conference on empirical methods in natural language processing, 779–784

  14. Deng Z, Ma F, Lan R et al (2020) A two-stage chinese text summarization algorithm using keyword information and adversarial learning. Neurocomputing. https://doi.org/10.1016/j.neucom.2020.02.102

    Article  Google Scholar 

  15. Zheng J, Zhao Z, Song Z et al (2020) Abstractive meeting summarization by hierarchical adaptive segmental network learning with multiple revising steps. Neurocomputing 378:179–188

    Article  Google Scholar 

  16. Takase S, Suzuki J, Okazaki N et al. (2016) Neural headline generation on abstract meaning representation. Proceedings of the 2016 conference on empirical methods in natural language processing, 1054–1059

  17. Chen Q, Zhu XD, Ling ZH et al. (2016) Distraction-based neural networks for modeling documents. Proceedings of the 2016 International Joint Conference on Artificial Intelligence, 2754–2760

  18. Li H, Zhu J, Zhang J et al. (2020) Keywords-guided abstractive sentence summarization. Proceedings of the 2020 AAAI conference on artificial intelligence, 8196–8203

  19. Zhang Y, Merck D, Tsai E et al. (2020) Optimizing the factual correctness of a summary: a study of summarizing radiology reports. Proceedings of the 58th annual meeting of the association for computational linguistics, 5108–5120

  20. Goodrich B, Rao V, Saleh M et al. (2019) Assessing the factual accuracy of generated text. Proceedings of the 25th ACMSIGKDD international conference on knowledge discovery and data mining, 166–175

  21. Falke T, Ribeiro LF, Utama PA, Dagan I, Gurevych I (2019) Ranking generated summaries by correctness: an interesting but challenging application for natural language inference. Proceedings of the 57th annual meeting of the association for computational linguistics, 2214–2220

  22. Kryściński W, McCann B, Xiong C, Socher R (2020) Evaluating the factual consistency of abstractive text summarization. Proceedings of the 2020 conference on empirical methods in natural language processing, 9332–9346

  23. Cao Z, Wei F, Li W et al. (2017) Faithful to the original: fact aware neural abstractive summarization. Proceedings of the 31th AAAI conference on artificial intelligence, 4784–4791

  24. Angeli G, Premkumar MJ, Manning CD (2015) Leveraging linguistic structure for open domain information extraction. Proceedings of the 53th annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing, 344–354

  25. Cho S, Lebanoff L, Foroosh H, Liu F (2019) Improving the similarity measure of determinantal point processes for extractive multi-document summarization. Proceedings of the 2019 annual meeting of the association for computational linguistics, 1027–1037

  26. Kedzie C, McKeown K, Daume H (2018) Content selection in deep learning models of summarization. Proceedings of the 2018 conference on empirical methods in natural language processing, 1818–1828

  27. Cheng J, Lapata M (2016) Neural summarization by extracting sentences and words. Proceedings of the 2016 annual meeting of the association for computational linguistics, 484–494

  28. Lebanoff L, Dernoncourt F, Kim DS et al. (2020) A cascade approach to neural abstractive summarization with content selection and fusion. Proceedings of the 1th conference of the Asia-Pacific chapter of the association for computational linguistics and the 10th international joint conference on natural language processing, 529–535

  29. Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, 4171–4186

  30. Rush AM, Chopra S, Weston J (2015) A neural attention model for abstractive sentence summarization. Proceedings of the 2015 conference on empirical methods in natural language processing, 379–389

  31. Zhou Q, Yang N, Wei F et al. (2017) Selective encoding for abstractive sentence summarization. Proceedings of the 2017 annual meeting of the association for computational linguistics, 1095–1104

  32. Chopra S, Auli M, Rush AM (2016) Abstractive sentence summarization with attentive recurrent neural networks. Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics: human language technologies, 93–98

  33. Gu J, Lu Z, Li H et al. (2016) Incorporating copying mechanism in sequence-to-sequence learning. Proceedings of the 54th annual meeting of the association for computational linguistics, 1631–1640

  34. Paulus R, Xiong C, Socher R (2018) A deep reinforced model for abstractive summarization. Proceedings of the 2018 international conference on learning representations

  35. See A, Liu PJ, Manning CD (2017) Get to the point: summarization with pointer-generator networks. Proceedings of the 2017 annual meeting of the association for computational linguistics, 1073–1083

  36. Zhu C, Yang Z, Gmyr R, Zeng M, Huan X (2019) Make lead bias in your favor: a simple and effective method for news summarization. arXiv preprint http://arxiv.org/abs/1912.11602

  37. Liu Y, Lapata M (2019) Text summarization with pretrained encoders. arXiv preprint http://arxiv.org/abs/1908.08345

  38. Zhang X, Wei F, Zhou M (2019) HIBERT: document level pre-training of hierarchical bidirectional transformers for document summarization. Proceedings of the 57th annual meeting of the association for computational linguistics, 5059–5069

  39. Chen YC, Bansal M (2018) Fast abstractive summarization with reinforce-selected sentence rewriting. Proceedings of the 56th annual meeting of the association for computational linguistics, 675–686

  40. Gehrmann S, Deng Y, Rush AM (2018) Bottom-up abstractive summarization. Proceedings of the 2018 conference on empirical methods in natural language processing, 4098–4109

  41. Guo H, Pasunuru R, Bansal M (2018) Soft layer-specific multi-task summarization with entailment and question generation. Proceedings of the 2018 annual meeting of the association for computational linguistics, 687–697

  42. Lebanoff L, Song K, Dernoncourt F et al. (2019) Scoring sentence singletons and pairs for abstractive summarization. Proceedings of the 2019 annual meeting of the association for computational linguistics, 2175–2189

  43. Maynez J, Narayan S, Bohnet B et al. (2020) On faithfulness and factuality in abstractive summarization. arXiv preprint http://arxiv.org/abs/2005.00661

  44. Gunel B, Zhu C, Zeng M, Huang X (2019) Mind the facts: knowledge boosted coherent abstractive text summarization. Proceedings of the workshop on knowledge representation and reasoning meets machine learning in NeurIPS

  45. Li H, Zhu J, Zhang J, Zong C (2018) Ensure the correctness of the summary: incorporate entailment knowledge into abstractive sentence summarization. Proceedings of the 27th international conf. on computational linguistics, 1430–1441

  46. Carbonell J, Goldstein J (1998) The use of MMR, diversity-based reranking for reordering documents and producing summaries. Proceedings of the 21th annual international ACM SIGIR conference on research and development in information retrieval, 335–336

  47. Vaswani A, Shazeer N, Parmar N, Uszkoreit J et al. (2017) Attention is all you need. Proceedings of annual conference on neural information processing system, 5998–6008

  48. Narayan S, Cohen SB, Lapata M (2018) Don’t give me the details, just the summary! topic-aware convolutional neural networks for extreme summarization. Proceedings of the 2018 conference on empirical methods in natural language processing, 1797–1807

  49. Hermann KM, Kociský T, Grefenstette E, Espeholt L et al. (2015) Teaching machines to read and comprehend. Proceedings of annual conference on neural information processing system, 1693–1701

  50. Song K, Tan X, Qin T, Lu J, Liu TY (2019) MASS: masked sequence to sequence pre-training for language generation. Proceedings of the 36th international conference on machine learning, 5926–5936

  51. Srivastava N, Hinton GE, Krizhevsky A, Sutskever I et al (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958

    MathSciNet  MATH  Google Scholar 

  52. Kingma DP, Ba J (2015) Adam: a Method for Stochastic Optimization. In: Yoshua B, Yann L (Eds), 3rd international conference on learning representations, ICLR 2015, San Diego, CA, USA, May 7–9, 2015, conference track proceedings. http://arxiv.org/abs/1412.6980

  53. Xu S, Li H, Yuan P, Wu Y, He X, Zhou B (2020) Self-attention guided copy mechanism for abstractive summarization. Proceedings of the 58th annual meeting of the association for computational linguistics, 1355–1362

  54. Shen X, Zhao Y, Su H et al. (2019) Improving latent alignment in text summarization by generalizing the pointer generator. Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing, 3753–3764

  55. Raffel C, Shazeer N, Roberts A et al (2020) Exploring the limits of transfer learning with a unified text-to-text transformer. J Mach Learn Res 21(140):1–67

    MathSciNet  MATH  Google Scholar 

  56. Xiao D, Zhang H, Li YK, Sun Y et al. (2020) ERNIE-GEN: an enhanced multi-flow pre-training and fine-tuning framework for natural language generation. Proceedings of the 29th international joint conference on artificial intelligence, 3997–4003

  57. Lin CY (2014) Rouge: a package for automatic evaluation of summaries. Proceedings of text summarization branches out: ACL workshop, 74–81

  58. Li H, Yuan P, Xu S, Wu Y et al. (2020) Aspect-aware multimodal summarization for Chinese e-commerce products. Proceedings of the 34th conference on artificial intelligence, 8188–8195

  59. LeClair A, Haque S, Wu L, McMillan C (2020) Improved code summarization via a graph neural network. Proceedings of the 28th international conference on program comprehension, 184–195

Download references

Acknowledgements

Our work was supported by the Guangxi Science and Technology Foundation (2019GXNSFGA245004, 2018GXNSFAA138116) and the National Natural Science Foundation of China (61862011).

Author information

Corresponding author

Correspondence to Mengli Zhang.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Zhang, M., Zhou, G., Yu, W. et al. FCSF-TABS: two-stage abstractive summarization with fact-aware reinforced content selection and fusion. Neural Comput & Applic 34, 10547–10560 (2022). https://doi.org/10.1007/s00521-021-06880-0

  • DOI: https://doi.org/10.1007/s00521-021-06880-0
