Abstract
Summary-level extractive summarization is often framed as a text-matching task, in which a matching model selects the candidate summary that is semantically closest to the source document. However, because this approach scores each candidate by its semantic similarity to the source document, it tends to favor candidates containing more sentences. To address this issue, we propose SeburSum, a novel set-based summary ranking strategy for extractive summarization that selects a summary by examining its semantic similarity to the other, mutually exclusive candidate summaries rather than to the source document. In contrast to conventional extractive summarization methods, which rely on extractive models trained on labeled data to select sentences, our ranking strategy eliminates the need for labeled data and thus serves both supervised and unsupervised extractive summarization. To compute the semantic similarity between candidates more accurately, we construct a contrastive learning framework with a task-specific contrastive loss to learn a vector representation for each candidate summary. Experimental results show that in supervised extractive summarization, we achieve state-of-the-art extractive performance on CNN/Daily Mail, Reddit, and XSum, with ROUGE-1 scores of 45.49, 26.71, and 25.77, respectively, outperforming the previous summary-level baseline by 1.08, 1.62, and 0.91. In unsupervised extractive summarization, we achieve state-of-the-art performance on the CNN/Daily Mail dataset with 42.97, 20.14, and 39.14 for ROUGE-1, ROUGE-2, and ROUGE-L, respectively, outperforming the latest state-of-the-art results by 1.71, 1.96, and 1.93.
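As a rough illustration of the set-based ranking idea described above (not the paper's exact scoring rule), a candidate summary can be scored by its average cosine similarity to the embeddings of the other candidates, rather than by its similarity to the source document; the function name and the toy embeddings below are assumptions made for the sketch:

```python
import numpy as np

def rank_candidates(cand_embs: np.ndarray) -> int:
    """Pick the candidate whose embedding is, on average, closest to the
    OTHER candidates' embeddings (cosine similarity), instead of comparing
    each candidate to the source document embedding."""
    # Normalize rows so that dot products equal cosine similarities.
    normed = cand_embs / np.linalg.norm(cand_embs, axis=1, keepdims=True)
    sims = normed @ normed.T                 # pairwise cosine similarities
    np.fill_diagonal(sims, 0.0)              # exclude self-similarity
    scores = sims.sum(axis=1) / (len(cand_embs) - 1)  # mean similarity to others
    return int(np.argmax(scores))

# Toy example: embeddings for three candidate summaries. The first two are
# semantically close to each other, the third is an outlier.
cands = np.array([[1.0, 0.1], [0.9, 0.2], [-0.5, 1.0]])
best = rank_candidates(cands)  # index of the consensus candidate
```

In this sketch the two mutually similar candidates reinforce each other's scores, while the outlier is penalized; a length bias toward candidates with more sentences does not enter the score, since the document embedding is never consulted.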
Availability of data and materials
The data and code that support the findings of this study are openly available at https://github.com/GongShuai8210/SeburSum.
Acknowledgements
The work in this paper is supported by the National Social Science Foundation of China (19BYY076) and Shandong Natural Science Foundation (ZR2021MF064, ZR2021QG041). We would like to thank our teachers for their careful guidance. We also thank the members of our NLP group for their helpful discussions. We sincerely thank the volunteers for their evaluation of the summaries. Finally, we would like to thank all authors for their contributions and anonymous reviewers for their constructive comments.
Author information
Authors and Affiliations
Contributions
We list author contributions as follows. SG: conceptualization, methodology, software, writing—original draft, and formal analysis. ZZ: validation, writing—review and editing, resources, funding acquisition, and project administration. JQ: investigation, writing—review and editing, and project administration. WW: validation, investigation, and writing—review and editing. CT: validation and investigation. All authors reviewed the manuscript.
Corresponding authors
Ethics declarations
Ethical approval
Not applicable.
Conflict of interest
The authors declare that they have no known competing interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Gong, S., Zhu, Z., Qi, J. et al. SeburSum: a novel set-based summary ranking strategy for summary-level extractive summarization. J Supercomput 79, 12949–12977 (2023). https://doi.org/10.1007/s11227-023-05165-8