Abstract
Extractive query-focused multi-document summarization (QF-MDS) is the task of automatically generating an informative summary, from a collection of documents, that answers a given query. Sentence and query representation is a cornerstone that strongly affects the effectiveness of many QF-MDS methods. Transfer learning using pre-trained word embedding models has shown promising performance in many applications. However, most of these representations do not consider the order of, or the semantic relationships between, words in a sentence, and thus they do not carry the meaning of a full sentence. In this paper, to deal with this issue, we propose to leverage transfer learning from pre-trained sentence embedding models to represent documents’ sentences and users’ queries using embedding vectors that capture the semantic and syntactic relationships between their constituents (words, phrases). Furthermore, the BM25 score and a semantic similarity measure are linearly combined to retrieve a subset of sentences based on their relevance to the query. Finally, the maximal marginal relevance criterion is applied to re-rank the selected sentences, maintaining query relevance while minimizing redundancy. The proposed method is unsupervised, simple, efficient, and requires no labeled text summarization training data. Experiments are conducted using three standard datasets from the DUC evaluation campaign (DUC’2005–2007). The overall results show that our method outperforms several state-of-the-art systems and achieves results comparable to the best performing systems, including supervised deep learning-based methods.
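The retrieval and re-ranking steps described above can be sketched in plain Python. This is a minimal illustration, not the paper's implementation: the BM25 variant, the min-max normalization of BM25 scores, the mixing weight `alpha`, the trade-off parameter `lam`, and the toy sentence vectors are all assumptions for the sake of the example; in the paper, sentence vectors come from pre-trained sentence embedding models (e.g., the Universal Sentence Encoder).

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs_tokens, k1=1.2, b=0.75):
    """Okapi BM25 relevance of each tokenized sentence to the query."""
    n = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / n
    df = Counter(t for d in docs_tokens for t in set(d))
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        s = 0.0
        for t in query_tokens:
            if t not in tf:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def cosine(u, v):
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return sum(a * b for a, b in zip(u, v)) / (nu * nv) if nu and nv else 0.0

def combined_relevance(bm25, cos_sims, alpha=0.5):
    """Linear combination of (min-max normalized) BM25 and cosine similarity."""
    lo, hi = min(bm25), max(bm25)
    norm = [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in bm25]
    return [alpha * n + (1 - alpha) * c for n, c in zip(norm, cos_sims)]

def mmr_rerank(relevance, sent_vecs, k=3, lam=0.7):
    """Greedy maximal marginal relevance: trade query relevance
    against redundancy with already-selected sentences."""
    selected, candidates = [], list(range(len(relevance)))
    while candidates and len(selected) < k:
        best = max(
            candidates,
            key=lambda i: lam * relevance[i] - (1 - lam) * max(
                (cosine(sent_vecs[i], sent_vecs[j]) for j in selected),
                default=0.0))
        selected.append(best)
        candidates.remove(best)
    return selected
```

With toy two-dimensional "embeddings", `mmr_rerank(combined_relevance(bm25, cos), vecs, k)` returns the indices of the summary sentences in selection order; the real pipeline simply substitutes pre-trained sentence-encoder vectors for the toy ones.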
Notes
https://duc.nist.gov/
ROUGE-1.5.5 parameters: -a -c 95 -m -n 2 -2 4 -u -p 0.5 -l 250
References
Bowman SR, Angeli G, Potts C, Manning CD (2015) A large annotated corpus for learning natural language inference. In: Proceedings of the 2015 Conference on empirical methods in natural language processing, pp 632–642, https://doi.org/10.18653/v1/D15-1075
Brown TB, Mann B, Ryder N, Subbiah M, Kaplan J, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A, et al. (2020) Language models are few-shot learners. arXiv preprint arXiv:2005.14165
Canhasi E, Kononenko I (2014) Weighted archetypal analysis of the multi-element graph for query-focused multi-document summarization. Expert Syst Appl 41(2):535–543
Cao Z, Li W, Li S, Wei F, Li Y (2016) AttSum: Joint learning of focusing and summarization with neural attention. In: Proceedings of COLING 2016, the 26th International Conference on computational linguistics: Technical Papers, pp 547–556
Carbonell J, Goldstein J (1998) The use of MMR, diversity-based reranking for reordering documents and producing summaries. In: Proceedings of the 21st Annual International ACM SIGIR Conference on research and development in information retrieval, pp 335–336
Celikyilmaz A, Hakkani RD (2010) A hybrid hierarchical model for multi-document summarization. In: Proceedings of the 48th Annual Meeting of the Association for computational linguistics, pp 815–824
Cer D, Yang Y, Kong Sy, Hua N, Limtiaco N, John RS, Constant N, Guajardo-Cespedes M, Yuan S, Tar C, et al. (2018) Universal sentence encoder. arXiv preprint arXiv:1803.11175
Conneau A, Kiela D, Schwenk H, Barrault L, Bordes A (2017) Supervised learning of universal sentence representations from natural language inference data. In: Proceedings of the 2017 Conference on empirical methods in natural language processing, pp 670–680, https://doi.org/10.18653/v1/D17-1070
Conroy JM, Schlesinger JD, Stewart JG (2005) CLASSY query-based multi-document summarization. In: Proceedings of the Document Understanding Conference
Daumé III H, Marcu D (2006) Bayesian query-focused summarization. In: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, pp 305–312
Devlin J, Chang M, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, pp 4171–4186
Dietterich TG (1998) Approximate statistical tests for comparing supervised classification learning algorithms. Neural Comput 10(7):1895–1923
Eheela J, Janet B (2020) An abstractive summary generation system for customer reviews and news article using deep learning. J Ambient Intell Hum Comput. https://doi.org/10.1007/s12652-020-02412-1
Ethayarajh K (2018) Unsupervised random walk sentence embeddings: a strong but simple baseline. In: Proceedings of The Third Workshop on representation learning for NLP, pp 91–100
Fabbri A, Li I, She T, Li S, Radev D (2019) Multi-news: a large-scale multi-document summarization dataset and abstractive hierarchical model. In: Proceedings of the 57th Annual Meeting of the Association for computational linguistics, pp 1074–1084, https://doi.org/10.18653/v1/P19-1102
Feigenblat G, Roitman H, Boni O, Konopnicki D (2017) Unsupervised query-focused multi-document summarization using the cross entropy method. In: Proceedings of the 40th International ACM SIGIR Conference on research and development in information retrieval, pp 961–964
Haghighi A, Vanderwende L (2009) Exploring content models for multi-document summarization. In: Proceedings of Human Language Technologies: the 2009 Annual Conference of the North American chapter of the association for computational linguistics, pp 362–370
Howard J, Ruder S (2018) Universal language model fine-tuning for text classification. In: Proceedings of the 56th Annual Meeting of the Association for computational linguistics (Volume 1: Long Papers), pp 328–339, https://doi.org/10.18653/v1/P18-1031
Iyyer M, Manjunatha V, Boyd-Graber J, Daumé III H (2015) Deep unordered composition rivals syntactic methods for text classification. In: Proceedings of the 53rd Annual Meeting of the association for computational linguistics and the 7th International Joint Conference on natural language processing, pp 1681–1691
Jain A, Bhatia D, Thakur MK (2017) Extractive text summarization using word vector embedding. In: 2017 International Conference on machine learning and data science (MLDS), pp 51–55
Joshi A, Fidalgo E, Alegre E, Fernández-Robles L (2019) Summcoder: an unsupervised framework for extractive text summarization based on deep auto-encoders. Expert Syst Appl 129:200–215
Kiros R, Zhu Y, Salakhutdinov RR, Zemel R, Urtasun R, Torralba A, Fidler S (2015) Skip-thought vectors. In: Cortes C, Lee DD, Sugiyama M, Garnett R (eds) Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2, MIT Press, Montreal, Canada
Kobayashi H, Noguchi M, Yatsuka T (2015) Summarization based on embedding distributions. In: Proceedings of the 2015 Conference on empirical methods in natural language processing, EMNLP 2015, pp 1984–1989
Lamsiyah S, Mahdaouy AE, Espinasse B, Alaoui SOE (2020) An unsupervised method for extractive multi-document summarization based on centroid approach and sentence embeddings. Expert Syst Appl. https://doi.org/10.1016/j.eswa.2020.114152
Lebanoff L, Song K, Liu F (2018) Adapting the neural encoder-decoder framework from single to multi-document summarization. In: Proceedings of the 2018 Conference on empirical methods in natural language processing, pp 4131–4141, https://doi.org/10.18653/v1/D18-1446
Lewis M, Liu Y, Goyal N, Ghazvininejad M, Mohamed A, Levy O, Stoyanov V, Zettlemoyer L (2020) BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In: Proceedings of the 58th Annual Meeting of the Association for computational linguistics, pp 7871–7880, https://doi.org/10.18653/v1/2020.acl-main.703
Lin CY (2004) Rouge: A package for automatic evaluation of summaries. In: Text summarization branches out, Association for computational linguistics, Barcelona, Spain, pp 74–81
Liu Y, Lapata M (2019) Text summarization with pretrained encoders. In: Proceedings of the 2019 Conference on empirical methods in natural language processing and the 9th International Joint Conference on natural language processing (EMNLP-IJCNLP), pp 3730–3740
Ma S, Deng ZH, Yang Y (2016) An unsupervised multi-document summarization framework based on neural document model. In: Proceedings of COLING 2016, the 26th International Conference on computational linguistics: Technical Papers, pp 1514–1523
Mao Y, Qu Y, Xie Y, Ren X, Han J (2020) Multi-document summarization with maximal marginal relevance-guided reinforcement learning. In: Proceedings of the 2020 Conference on empirical methods in natural language processing, EMNLP 2020, Online, November 16-20, 2020, pp 1737–1751, https://doi.org/10.18653/v1/2020.emnlp-main.136
Nenkova A, McKeown K (2011) Automatic summarization. Found Trends® Inf Retrieval 5:103–233. https://doi.org/10.1561/1500000015
Nenkova A, McKeown K (2012) A survey of text summarization techniques. In: Aggarwal, Charu C (eds) Mining text data, Springer US, Boston, MA, pp 43–76. https://doi.org/10.1007/978-1-4614-3223-4_3
Ouyang Y, Li W, Li S, Lu Q (2011) Applying regression models to query-focused multi-document summarization. Inf Process Manag 47(2):227–237
Pennington J, Socher R, Manning C (2014) GloVe: global vectors for word representation. In: Proceedings of the 2014 Conference on empirical methods in natural language processing (EMNLP), pp 1532–1543, https://doi.org/10.3115/v1/D14-1162
Radev DR, Jing H, Styś M, Tam D (2004) Centroid-based summarization of multiple documents. Inf Process Manag 40(6):919–938
Radford A, Narasimhan K, Salimans T, Sutskever I (2018) Improving language understanding by generative pre-training. https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf?fbclid=IwAR1N9fMDU7Txt2Sv0Vw6e3TVtLY75qSKfJbPP6NfdVrwzJsl49B80dlJvk
Raffel C, Shazeer N, Roberts A, Lee K, Narang S, Matena M, Zhou Y, Li W, Liu PJ (2019) Exploring the limits of transfer learning with a unified text-to-text transformer. arXiv preprint arXiv:1910.10683
Rajpurkar P, Zhang J, Lopyrev K, Liang P (2016) SQuAD: 100,000+ questions for machine comprehension of text. In: Proceedings of the 2016 Conference on empirical methods in natural language processing, pp 2383–2392, https://doi.org/10.18653/v1/D16-1264
Ramos J (2003) Using tf-idf to determine word relevance in document queries. In: Proceedings of the first instructional conference on machine learning, Piscataway, NJ, USA 242:133–142
Ren P, Chen Z, Ren Z, Wei F, Ma J, de Rijke M (2017) Leveraging contextual sentence relations for extractive summarization using a neural attention model. In: Proceedings of the 40th International ACM SIGIR Conference on research and development in information retrieval, pp 95–104
Ren P, Chen Z, Ren Z, Wei F, Nie L, Ma J, De Rijke M (2018) Sentence relations for extractive summarization with deep neural networks. ACM Trans Inf Syst (TOIS) 36:1–32
Robertson SE, Walker S, Jones S, Hancock-Beaulieu MM, Gatford M (1995) Okapi at trec-3. In: Overview of the Third Text REtrieval Conference (TREC-3), Gaithersburg, MD: NIST, pp 109–126. https://www.microsoft.com/en-us/research/publication/okapi-at-trec-3/
Roitman H, Feigenblat G, Cohen D, Boni O, Konopnicki D (2020) Unsupervised dual-cascade learning with pseudo-feedback distillation for query-focused extractive summarization. In: WWW ’20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020, pp 2577–2584, https://doi.org/10.1145/3366423.3380009
Rossiello G, Basile P, Semeraro G (2017) Centroid-based text summarization through compositionality of word embeddings. In: Proceedings of the MultiLing 2017 Workshop on summarization and summary evaluation across source types and genres, pp 12–21
Ruder S, Peters ME, Swayamdipta S, Wolf T (2019) Transfer learning in natural language processing. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for computational linguistics: tutorials, pp 15–18
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M et al (2015) Imagenet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252
Sakai T, et al. (2019) A comparative study of deep learning approaches for query-focused extractive multi-document summarization. In: 2019 IEEE 2nd International Conference on information and computer technologies (ICICT), IEEE, pp 153–157
Shen C, Li T, Ding CH (2011) Integrating clustering and multi-document summarization by bi-mixture probabilistic latent semantic analysis (plsa) with sentence bases. In: Twenty-Fifth AAAI Conference on artificial intelligence
Valizadeh M, Brazdil P (2015) Exploring actor-object relationships for query-focused multi-document summarization. Soft Comput 19(11):3109–3121
Van Lierde H, Chow TW (2019a) Learning with fuzzy hypergraphs: a topical approach to query-oriented text summarization. Inf Sci 496:212–224
Van Lierde H, Chow TW (2019b) Query-oriented text summarization based on hypergraph transversals. Inf Process Manag 56(4):1317–1338
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in neural information processing systems 30: Annual conference on neural information processing systems 2017, Curran Associates, Inc, Long Beach, CA, USA
Wan X, Xiao J (2009) Graph-based multi-modality learning for topic-focused multi-document summarization. In: Twenty-First International Joint Conference on artificial intelligence
Wan X, Zhang J (2014) CTSUM: extracting more certain summaries for news articles. In: Proceedings of the 37th International ACM SIGIR Conference on research & development in information retrieval, pp 787–796
Wieting J, Gimpel K (2018) ParaNMT-50M: Pushing the limits of paraphrastic sentence embeddings with millions of machine translations. In: Proceedings of the 56th Annual Meeting of the Association for computational linguistics (Volume 1: Long Papers), pp 451–462, https://doi.org/10.18653/v1/P18-1042
Wu Y, Li Y, Xu Y (2019) Dual pattern-enhanced representations model for query-focused multi-document summarisation. Knowl-Based Syst 163:736–748
Xiong S, Ji D (2016) Query-focused multi-document summarization using hypergraph-based ranking. Inf Process Manag 52(4):670–681
Xu Y, Lapata M (2020) Coarse-to-fine query focused multi-document summarization. In: Proceedings of the 2020 Conference on empirical methods in natural language processing (EMNLP), pp 3632–3645, https://doi.org/10.18653/v1/2020.emnlp-main.296
Yao JG, Wan X, Xiao J (2015) Compressive document summarization via sparse optimization. In: Twenty-Fourth International Joint Conference on artificial intelligence
Yousefi-Azar M, Hamey L (2017) Text summarization using unsupervised deep learning. Expert Syst Appl 68:93–105
Zhong M, Liu P, Chen Y, Wang D, Qiu X, Huang X (2020) Extractive summarization as text matching. In: Jurafsky D, Chai J, Schluter N, Tetreault JR (eds) Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5-10, 2020, pp 6197–6208, https://doi.org/10.18653/v1/2020.acl-main.552
Zhong Sh, Liu Y, Li B, Long J (2015) Query-oriented unsupervised multi-document summarization via deep learning model. Expert Syst Appl 42(21):8146–8155
Example of generated summaries
Table 6 presents the generated summaries using TF-IDF-Sum, GloVe-Sum, uSIF-Sum, USE-DAN-Sum, and USE-Transformer-Sum for the query D307 of the DUC’2005 dataset.
Cite this article
Lamsiyah, S., El Mahdaouy, A., Ouatik El Alaoui, S. et al. Unsupervised query-focused multi-document summarization based on transfer learning from sentence embedding models, BM25 model, and maximal marginal relevance criterion. J Ambient Intell Human Comput 14, 1401–1418 (2023). https://doi.org/10.1007/s12652-021-03165-1