
HierMDS: a hierarchical multi-document summarization model with global–local document dependencies

  • Original Article
  • Published:
Neural Computing and Applications

A Correction to this article was published on 10 May 2024


Abstract

Multi-document summarization (MDS) has attracted increasing attention in recent years. Most existing MDS systems simply encode multiple documents as a single flat concatenated sequence, which limits their ability to represent relationships among the documents. To address this issue, we propose a Hierarchical Multi-Document Summarization model with global–local document dependencies (HierMDS). HierMDS consists of five sub-blocks, i.e., an embedding block, an internal document encoding block, a local document encoding block, a global document encoding block, and a fusion block, which are stacked in a hierarchical structure to gradually produce dependency-enriched document representations. Specifically, the embedding block encodes tokens, and the internal document encoding block encodes each document. Then, for a given document, two kinds of document dependencies are extracted: (1) the global document dependency, under which the representation of the document is affected by all the other documents, and (2) the local document dependency, under which the representation of the document is affected only by the relevant documents. We assume that the global document dependency captures global background information, while the local document dependency condenses the most relevant information. Concretely, the global document encoding block, built on the vanilla transformer layer, encodes the global document dependencies, and the local document encoding block, built on graph attention networks, encodes the local document dependencies. Finally, HierMDS produces dependency-enriched document representations by fusing the local and global document dependencies in the fusion block. Experimental results on the Multi-News and DUC-2004 datasets demonstrate the competitive advantages of HierMDS over several state-of-the-art MDS models.
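The abstract describes the encoder stack concretely enough to sketch in code. The following PyTorch sketch is illustrative only and is not the authors' implementation: the token-pooling step, layer counts, hidden sizes, and the gated-sum fusion rule are assumptions, and the document-relevance graph `adj` is taken as given. Only the overall block layout (embedding, internal document encoding, a vanilla transformer layer for global dependencies, graph attention for local dependencies, and a fusion step) follows the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGATLayer(nn.Module):
    """Single-head graph attention over document nodes (GAT-style)."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.attn = nn.Linear(2 * dim, 1)

    def forward(self, x, adj):
        # x: (n_docs, dim); adj: (n_docs, n_docs) 0/1 relevance mask.
        # adj is assumed to include self-loops so every row attends somewhere.
        h = self.proj(x)
        n = h.size(0)
        pairs = torch.cat(
            [h.unsqueeze(1).expand(n, n, -1), h.unsqueeze(0).expand(n, n, -1)],
            dim=-1,
        )                                                   # (n, n, 2*dim)
        e = F.leaky_relu(self.attn(pairs).squeeze(-1))      # (n, n) attention scores
        e = e.masked_fill(adj == 0, float("-inf"))          # restrict to relevant docs
        return F.elu(torch.softmax(e, dim=-1) @ h)

class HierMDSEncoder(nn.Module):
    """Sketch of the five-block encoder stack named in the abstract."""
    def __init__(self, vocab_size, dim=256, heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)           # embedding block
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.internal = nn.TransformerEncoder(layer, num_layers=2)  # internal document encoding
        self.global_block = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.local_block = SimpleGATLayer(dim)               # local document encoding
        self.fuse = nn.Linear(2 * dim, dim)                  # fusion block (gated sum is assumed)

    def forward(self, docs, adj):
        # docs: (n_docs, seq_len) token ids; adj: (n_docs, n_docs) relevance graph.
        tok = self.internal(self.embed(docs))                # (n_docs, seq_len, dim)
        doc_repr = tok.mean(dim=1)                           # mean-pool tokens (assumption)
        g = self.global_block(doc_repr.unsqueeze(0)).squeeze(0)  # each doc attends to all docs
        l = self.local_block(doc_repr, adj)                  # each doc attends to relevant docs
        gate = torch.sigmoid(self.fuse(torch.cat([g, l], dim=-1)))
        return gate * g + (1 - gate) * l                     # dependency-enriched representations
```

A minimal smoke test under the same assumptions: `enc = HierMDSEncoder(vocab_size=1000)` applied to `torch.randint(0, 1000, (4, 50))` with `adj = torch.eye(4)` yields a (4, 256) tensor of fused document representations.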




Data availability

The datasets generated during and/or analyzed during the current study are available from the authors on reasonable request.

Change history

Notes

  1. Task 1 of SemEval-2017.


Acknowledgements

We thank the volunteers who helped us judge the quality of the system summaries in the human evaluation.

Funding

No funding was received to assist with the preparation of this manuscript.

Author information

Contributions

SL contributed to data curation, the original methodology, software, validation, formal analysis, and writing of the original draft. JX contributed to conceptualization, investigation, supervision, review, and editing.

Corresponding author

Correspondence to Jungang Xu.

Ethics declarations

Conflict of interest

The authors have no financial or proprietary interests in any material discussed in this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised to correct the first author name.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Li, S., Xu, J. HierMDS: a hierarchical multi-document summarization model with global–local document dependencies. Neural Comput & Applic 35, 18553–18570 (2023). https://doi.org/10.1007/s00521-023-08680-0


  • Received:
  • Accepted:
  • Published:
  • Issue Date:
  • DOI: https://doi.org/10.1007/s00521-023-08680-0

Keywords