Abstract
Generating a concise and relevant summary with respect to a specific query can broadly meet a user's information needs in many areas. Within a summarization system, the extractive technique is attractive because it is simple, fast, and produces reliable outputs. Salience and relevance are two key requirements for extractive summarization. Most existing approaches pursue them by augmenting input features, incorporating additional attention, or enlarging the training scale. Yet much unsupervised but query-related knowledge remains underexplored. To this end, in this paper we frame query-focused document summarization as a combination of salience prediction and relevance prediction. Concretely, in addition to the oracle summary set for the salience task, we further create a pseudo-summary set for user-specific queries (i.e., titles or image captions as queries) for the relevance task. Then, building on a modified BERT fine-tuned summarization model, we propose two methods, called guidance and distillation. Specifically, guidance training shares salient information to reinforce useful contextual representations in a two-stage training scheme with a salience-and-relevance objective. For distillation, we propose a new "guide-student" learning paradigm in which the relevance knowledge of the query is distilled and transferred from a guide model to a salience-oriented student model. Experimental results demonstrate that guidance training excels at improving the salience of the summary, while distillation training is significantly better at relevance learning. Both achieve state-of-the-art results in unsupervised query-focused settings on the CNN and DailyMail datasets.
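The paper's exact distillation objective is not reproduced here; as a hedged illustration of the "guide-student" idea, the following minimal sketch applies standard knowledge distillation (Hinton et al., 2015) to per-sentence extraction scores: the guide model's relevance-aware scores are softened with a temperature and the student is penalized by the KL divergence from that target distribution. All function names and the temperature value are illustrative assumptions, not the authors' implementation.

```python
import math

def softmax(scores, temperature=1.0):
    """Temperature-scaled softmax over a list of sentence scores."""
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_scores, guide_scores, temperature=2.0):
    """KL divergence KL(guide || student) between softened sentence
    distributions; the guide plays the role of the relevance-aware
    teacher, the student is the salience-oriented model."""
    p = softmax(guide_scores, temperature)    # guide (teacher) targets
    q = softmax(student_scores, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

In practice this term would be added to the student's supervised extraction loss, so the student keeps its salience objective while absorbing the guide's relevance signal.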
Data availability
The datasets generated during the current study are available from the corresponding author on reasonable request.
Notes
The ROUGE evaluation options are -m -n 2.
Ethics declarations
Conflict of interest
All authors declare that they have no conflict of interest.
About this article
Cite this article
Yue, Y., Li, Y., Zhan, Ja. et al. Query focused summarization via relevance distillation. Neural Comput & Applic 35, 16543–16557 (2023). https://doi.org/10.1007/s00521-023-08525-w