Abstract
Current abstractive summarization models consistently outperform their extractive counterparts, yet they are unable to close the gap to the oracle extractive upper bound. Recent research suggests that the cause lies in a lack of planning and poor sentence-level saliency intuition. Existing solutions either require new fine-tuning sessions to accommodate architectural changes or disrupt the natural information flow, limiting the use of the model's accumulated global knowledge. Inspired by result-blending techniques from text-to-image generation, we propose a plug-and-play alternative that preserves the integrity of the original model: Biased Encoder Mixture. Our approach uses attention masking and Siamese networks to reinforce the signal of salient tokens in the encoder embeddings, guiding the decoder toward more relevant output. Evaluation on four datasets and their respective state-of-the-art abstractive summarization models demonstrates that Biased Encoder Mixture outperforms attention-based plug-and-play alternatives even with static masking derived from the positional distribution of sentence saliency.
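At a high level, the mixture described in the abstract amounts to blending the embeddings of a plain encoder pass with those of a pass whose attention is biased toward salient tokens, and feeding the blend to the decoder. The following is a minimal illustrative sketch with hypothetical names and a single scalar blend weight; the paper's actual masking scheme and Siamese-network components are not reproduced here:

```python
import numpy as np

def biased_encoder_mixture(plain_emb, biased_emb, alpha=0.5):
    """Blend plain and saliency-biased encoder embeddings.

    plain_emb, biased_emb: (seq_len, hidden) arrays produced by two
    encoder passes over the same input; `alpha` is a hypothetical
    scalar weighting the saliency-biased pass.
    """
    return (1.0 - alpha) * plain_emb + alpha * biased_emb

# Toy example: two tokens, hidden size 2. In the biased pass the
# salient token signal is assumed to be reinforced (larger magnitude).
plain = np.array([[1.0, 0.0], [0.0, 1.0]])
biased = np.array([[3.0, 0.0], [0.0, 3.0]])
mixed = biased_encoder_mixture(plain, biased, alpha=0.5)
```

Because the blend happens purely on encoder outputs, the decoder and its weights are left untouched, which is what makes such an approach plug-and-play.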
Notes
1.
2. We use model weights available at https://huggingface.co/models.
3. We used the ‘all-mpnet-base-v2’ model available at https://www.sbert.net.
Acknowledgements
The work of Daniil Chernyshev (experiments, survey) was supported by the Non-commercial Foundation for Support of Science and Education “INTELLECT”. The work of Boris Dobrov (general concept, interpretation of results) was supported by a grant for research centers in the field of artificial intelligence, provided by the Analytical Center for the Government of the Russian Federation in accordance with the subsidy agreement (agreement identifier 000000D730321P5Q0002) and the agreement with the Ivannikov Institute for System Programming of the Russian Academy of Sciences dated November 2, 2021, No. 70-2021-00142.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Chernyshev, D., Dobrov, B. (2024). Content Selection in Abstractive Summarization with Biased Encoder Mixtures. In: Ignatov, D.I., et al. Analysis of Images, Social Networks and Texts. AIST 2023. Lecture Notes in Computer Science, vol 14486. Springer, Cham. https://doi.org/10.1007/978-3-031-54534-4_5
Print ISBN: 978-3-031-54533-7
Online ISBN: 978-3-031-54534-4