
Content Selection in Abstractive Summarization with Biased Encoder Mixtures

  • Conference paper
  • First Online:
Analysis of Images, Social Networks and Texts (AIST 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14486))


Abstract

Current abstractive summarization models consistently outperform their extractive counterparts yet are unable to close the gap to the Oracle extractive upper bound. Recent research suggests that the reason lies in the lack of planning and poor sentence-level saliency intuition. Existing solutions to this problem either require new fine-tuning sessions to accommodate architectural changes or disrupt the natural information flow, limiting the utilization of accumulated global knowledge. Inspired by text-to-image result blending techniques, we propose Biased Encoder Mixture, a plug-and-play alternative that preserves the integrity of the original model. Our approach uses attention masking and Siamese networks to reinforce the signal of salient tokens in the encoder embeddings and guide the decoder toward more relevant results. Evaluation on four datasets and their respective state-of-the-art abstractive summarization models demonstrates that Biased Encoder Mixture outperforms attention-based plug-and-play alternatives even with static masking derived from the positional distribution of sentence saliency.
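To make the described mechanism more concrete, the following is a minimal sketch in the spirit of the approach: two Siamese passes through the same frozen encoder (one unbiased, one with attention restricted to salient tokens) are blended before decoding. The checkpoint name, the toy saliency mask, and the mixture weight `alpha` are illustrative assumptions, not the authors' implementation; the official code is linked in the notes below.

```python
# Illustrative sketch of a biased encoder mixture (not the authors' code).
# Assumptions: a public BART summarization checkpoint, a toy saliency mask,
# and a simple linear blend of the two encoder outputs with weight `alpha`.
import torch
from transformers import BartForConditionalGeneration, BartTokenizer
from transformers.modeling_outputs import BaseModelOutput

model_name = "facebook/bart-large-cnn"  # any encoder-decoder summarizer would do
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name).eval()

document = (
    "The council approved the new transit plan on Monday. "
    "The vote followed months of public hearings. "
    "Officials expect construction to begin next spring."
)
inputs = tokenizer(document, return_tensors="pt", truncation=True)

# Toy saliency mask: 1 for tokens to emphasize, 0 otherwise. In the paper this
# signal comes from saliency estimation; here we simply mark the first half.
salient_mask = inputs["attention_mask"].clone()
salient_mask[:, salient_mask.shape[1] // 2 :] = 0

with torch.no_grad():
    # Siamese passes through the same frozen encoder: one unbiased pass and
    # one with attention restricted to the (assumed) salient tokens.
    plain = model.get_encoder()(
        input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"]
    ).last_hidden_state
    biased = model.get_encoder()(
        input_ids=inputs["input_ids"], attention_mask=salient_mask
    ).last_hidden_state

    # Blend the two embedding sets to reinforce the salient-token signal.
    alpha = 0.5  # mixture weight (an assumption, not a value from the paper)
    mixed = (1 - alpha) * plain + alpha * biased

    # The unchanged decoder generates from the mixed encoder states.
    summary_ids = model.generate(
        encoder_outputs=BaseModelOutput(last_hidden_state=mixed),
        attention_mask=inputs["attention_mask"],
        num_beams=1,
        max_length=60,
    )

print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

Because the mixture is applied only to the encoder outputs at inference time, the underlying model weights and information flow remain untouched, which is what makes the method plug-and-play.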


Notes

  1. https://github.com/dciresearch/BEM-ContentSelection.

  2. We use model weights available at https://huggingface.co/models.

  3. We used the ‘all-mpnet-base-v2’ model available at https://www.sbert.net (see the sketch after this list).
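As a small illustration of note 3, the sketch below scores sentence saliency with the ‘all-mpnet-base-v2’ sentence-transformers model by comparing each sentence embedding to the document centroid. The centroid-similarity heuristic and the example sentences are our assumptions for illustration, not necessarily the scoring used in the paper.

```python
# Illustrative saliency scoring with the sentence-transformers model from note 3.
# The centroid-similarity heuristic below is an assumption made for this sketch.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-mpnet-base-v2")

sentences = [
    "The company reported record profits in the third quarter.",
    "Its headquarters are located in a quiet suburb.",
    "Analysts attribute the growth to strong overseas demand.",
]
embeddings = model.encode(sentences, convert_to_tensor=True)

# Score each sentence by cosine similarity to the mean document embedding.
centroid = embeddings.mean(dim=0, keepdim=True)
scores = util.cos_sim(embeddings, centroid).squeeze(-1)
for sentence, score in zip(sentences, scores.tolist()):
    print(f"{score:.3f}  {sentence}")
```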


Acknowledgements

The work of Daniil Chernyshev (experiments, survey) was supported by the Non-commercial Foundation for Support of Science and Education “INTELLECT”. The work of Boris Dobrov (general concept, interpretation of results) was supported by a grant for research centers in the field of artificial intelligence, provided by the Analytical Center for the Government of the Russian Federation in accordance with the subsidy agreement (agreement identifier 000000D730321P5Q0002) and the agreement with the Ivannikov Institute for System Programming of the Russian Academy of Sciences dated November 2, 2021, No. 70-2021-00142.


Corresponding author

Correspondence to Daniil Chernyshev.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Chernyshev, D., Dobrov, B. (2024). Content Selection in Abstractive Summarization with Biased Encoder Mixtures. In: Ignatov, D.I., et al. Analysis of Images, Social Networks and Texts. AIST 2023. Lecture Notes in Computer Science, vol 14486. Springer, Cham. https://doi.org/10.1007/978-3-031-54534-4_5


  • DOI: https://doi.org/10.1007/978-3-031-54534-4_5

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-54533-7

  • Online ISBN: 978-3-031-54534-4

  • eBook Packages: Computer Science, Computer Science (R0)
