Abstract
Multi-document summarization (MDS) has attracted increasing attention in recent years. Most existing MDS systems simply encode multiple documents as one flat concatenated sequence, which limits their capacity to represent the document set. To address this issue, we propose a Hierarchical Multi-Document Summarization Model with Global-Local Document Dependencies (HierMDS). HierMDS consists of five sub-blocks, i.e., an embedding block, an internal document encoding block, a local document encoding block, a global document encoding block, and a fusion block, which are stacked hierarchically to gradually produce dependency-enriched document representations. The embedding block encodes tokens, and the internal document encoding block encodes each document independently. Then, for a given document, two kinds of document dependencies are extracted: (1) the global document dependency, under which the representation of the document is affected by all the other documents, and (2) the local document dependency, under which the representation of the document is affected only by the relevant documents. We posit that the global document dependency captures global background information, while the local document dependency condenses the most relevant information. The global document encoding block, built on the vanilla transformer layer, encodes the global document dependencies, and the local document encoding block, built on graph attention networks, encodes the local document dependencies. Finally, HierMDS produces dependency-enriched document representations by fusing the local and global document dependencies in the fusion block. Experimental results on the Multi-News and DUC-2004 datasets demonstrate that HierMDS compares favorably with several state-of-the-art MDS models.
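To make the five-block hierarchy concrete, the following is a minimal PyTorch sketch of how such an encoder could be wired together. All module choices, dimensions, the mean-pooling of tokens into document vectors, and the concatenation-based fusion are illustrative assumptions, not the authors' implementation; in particular, the paper models the local block with graph attention networks, which is approximated here by masking standard multi-head attention with a document relevance graph.

```python
# A minimal sketch of a HierMDS-style encoder (illustrative assumptions only;
# this is not the authors' implementation).
import torch
import torch.nn as nn


class HierMDSEncoder(nn.Module):
    def __init__(self, vocab_size: int, d_model: int = 256, n_heads: int = 8):
        super().__init__()
        # Embedding block: token embeddings shared across documents.
        self.embed = nn.Embedding(vocab_size, d_model)
        # Internal document encoding block: encodes tokens within each document.
        self.intra = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        # Global document encoding block: vanilla transformer layer in which
        # every document attends to all the other documents.
        self.global_enc = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        # Local document encoding block: attention restricted to relevant
        # documents (a masked-attention stand-in for graph attention).
        self.local_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Fusion block: merges the local and global dependency representations.
        self.fuse = nn.Linear(2 * d_model, d_model)

    def forward(self, token_ids: torch.Tensor, relevance: torch.Tensor) -> torch.Tensor:
        # token_ids: (num_docs, seq_len), one row of token ids per document.
        # relevance: (num_docs, num_docs) boolean matrix marking relevant
        # document pairs; it must include self-loops so each document can
        # attend to itself.
        tokens = self.intra(self.embed(token_ids))    # per-document token states
        docs = tokens.mean(dim=1).unsqueeze(0)        # pool -> (1, num_docs, d_model)
        g = self.global_enc(docs)                     # global document dependency
        blocked = ~relevance                          # True = attention disallowed
        l, _ = self.local_attn(docs, docs, docs, attn_mask=blocked)  # local dependency
        return self.fuse(torch.cat([g, l], dim=-1))   # dependency-enriched documents
```

A toy invocation, again purely illustrative:

```python
enc = HierMDSEncoder(vocab_size=30_000)
ids = torch.randint(0, 30_000, (4, 128))   # 4 documents, 128 tokens each
rel = torch.eye(4, dtype=torch.bool)       # self-loops
rel[0, 1] = rel[1, 0] = True               # documents 0 and 1 are mutually relevant
out = enc(ids, rel)                        # (1, 4, 256) enriched representations
```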
Data availability
The datasets generated during and/or analyzed during the current study are available from the authors on reasonable request.
Change history
10 May 2024
A Correction to this paper has been published: https://doi.org/10.1007/s00521-024-09842-4
Acknowledgements
We thank the volunteers who helped us judge the quality of system summaries in the human evaluation.
Funding
No funding was received to assist with the preparation of this manuscript.
Author information
Contributions
SL contributed to data curation, the original methodology, software, validation, formal analysis, and writing the original draft. JX contributed to conceptualization, investigation, supervision, reviewing, and editing.
Ethics declarations
Conflict of interest
The authors have no financial or proprietary interests in any material discussed in this article.
Additional information
The original online version of this article was revised to correct the first Author name.
About this article
Cite this article
Li, S., Xu, J. HierMDS: a hierarchical multi-document summarization model with global–local document dependencies. Neural Comput & Applic 35, 18553–18570 (2023). https://doi.org/10.1007/s00521-023-08680-0