
Content Selection in Abstractive Summarization with Biased Encoder Mixtures

  • Conference paper
  • First Online:
Analysis of Images, Social Networks and Texts (AIST 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14486))


Abstract

Current abstractive summarization models consistently outperform their extractive counterparts yet are unable to close the gap to the Oracle extractive upper bound. Recent research suggests that the reason lies in the lack of planning and poor sentence-level saliency intuition. Existing solutions to this problem either require new fine-tuning sessions to accommodate architectural changes or disrupt the natural information flow, limiting the utilization of accumulated global knowledge. Inspired by text-to-image result blending techniques, we propose Biased Encoder Mixture, a plug-and-play alternative that preserves the integrity of the original model. Our approach uses attention masking and Siamese networks to reinforce the signal of salient tokens in the encoder embeddings and guide the decoder toward more relevant results. Evaluation on four datasets and their respective state-of-the-art abstractive summarization models demonstrates that Biased Encoder Mixture outperforms attention-based plug-and-play alternatives even with static masking derived from the positional distribution of sentence saliency.
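To make the described mechanism more concrete, the following is a minimal sketch in the spirit of the approach: two Siamese passes through the same frozen encoder (one unbiased, one with attention restricted to salient tokens) are blended before decoding. The checkpoint name, the toy saliency mask, and the mixture weight `alpha` are illustrative assumptions, not the authors' implementation; the official code is linked in the notes below.

```python
# Illustrative sketch of a biased encoder mixture (not the authors' code).
# Assumptions: a public BART summarization checkpoint, a toy saliency mask,
# and a simple linear blend of the two encoder outputs with weight `alpha`.
import torch
from transformers import BartForConditionalGeneration, BartTokenizer
from transformers.modeling_outputs import BaseModelOutput

model_name = "facebook/bart-large-cnn"  # any encoder-decoder summarizer would do
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name).eval()

document = (
    "The council approved the new transit plan on Monday. "
    "The vote followed months of public hearings. "
    "Officials expect construction to begin next spring."
)
inputs = tokenizer(document, return_tensors="pt", truncation=True)

# Toy saliency mask: 1 for tokens to emphasize, 0 otherwise. In the paper this
# signal comes from saliency estimation; here we simply mark the first half.
salient_mask = inputs["attention_mask"].clone()
salient_mask[:, salient_mask.shape[1] // 2 :] = 0

with torch.no_grad():
    # Siamese passes through the same frozen encoder: one unbiased pass and
    # one with attention restricted to the (assumed) salient tokens.
    plain = model.get_encoder()(
        input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"]
    ).last_hidden_state
    biased = model.get_encoder()(
        input_ids=inputs["input_ids"], attention_mask=salient_mask
    ).last_hidden_state

    # Blend the two embedding sets to reinforce the salient-token signal.
    alpha = 0.5  # mixture weight (an assumption, not a value from the paper)
    mixed = (1 - alpha) * plain + alpha * biased

    # The unchanged decoder generates from the mixed encoder states.
    summary_ids = model.generate(
        encoder_outputs=BaseModelOutput(last_hidden_state=mixed),
        attention_mask=inputs["attention_mask"],
        num_beams=1,
        max_length=60,
    )

print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

Because the mixture is applied only to the encoder outputs at inference time, the underlying model weights and information flow remain untouched, which is what makes the method plug-and-play.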


Notes

  1. https://github.com/dciresearch/BEM-ContentSelection.

  2. We use model weights available at https://huggingface.co/models.

  3. We used the ‘all-mpnet-base-v2’ model available at https://www.sbert.net (see the sketch after this list).
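As a small illustration of note 3, the sketch below scores sentence saliency with the ‘all-mpnet-base-v2’ sentence-transformers model by comparing each sentence embedding to the document centroid. The centroid-similarity heuristic and the example sentences are our assumptions for illustration, not necessarily the scoring used in the paper.

```python
# Illustrative saliency scoring with the sentence-transformers model from note 3.
# The centroid-similarity heuristic below is an assumption made for this sketch.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-mpnet-base-v2")

sentences = [
    "The company reported record profits in the third quarter.",
    "Its headquarters are located in a quiet suburb.",
    "Analysts attribute the growth to strong overseas demand.",
]
embeddings = model.encode(sentences, convert_to_tensor=True)

# Score each sentence by cosine similarity to the mean document embedding.
centroid = embeddings.mean(dim=0, keepdim=True)
scores = util.cos_sim(embeddings, centroid).squeeze(-1)
for sentence, score in zip(sentences, scores.tolist()):
    print(f"{score:.3f}  {sentence}")
```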


Acknowledgements

The work of Daniil Chernyshev (experiments, survey) was supported by the Non-commercial Foundation for Support of Science and Education “INTELLECT”. The work of Boris Dobrov (general concept, interpretation of results) was supported by a grant for research centers in the field of artificial intelligence, provided by the Analytical Center for the Government of the Russian Federation in accordance with the subsidy agreement (agreement identifier 000000D730321P5Q0002) and the agreement with the Ivannikov Institute for System Programming of the Russian Academy of Sciences dated November 2, 2021, No. 70-2021-00142.


Corresponding author

Correspondence to Daniil Chernyshev.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Chernyshev, D., Dobrov, B. (2024). Content Selection in Abstractive Summarization with Biased Encoder Mixtures. In: Ignatov, D.I., et al. Analysis of Images, Social Networks and Texts. AIST 2023. Lecture Notes in Computer Science, vol 14486. Springer, Cham. https://doi.org/10.1007/978-3-031-54534-4_5


  • DOI: https://doi.org/10.1007/978-3-031-54534-4_5

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-54533-7

  • Online ISBN: 978-3-031-54534-4

  • eBook Packages: Computer Science, Computer Science (R0)
