
Sentence Relations for Extractive Summarization with Deep Neural Networks

Published: 30 April 2018

Abstract

Sentence regression is a type of extractive summarization that achieves state-of-the-art performance and is commonly used in practical systems. The most challenging task within the sentence regression framework is to identify discriminative features to represent each sentence. In this article, we study the use of sentence relations, namely Contextual Sentence Relations (CSR), Title Sentence Relations (TSR), and Query Sentence Relations (QSR), to improve the performance of sentence regression. CSR, TSR, and QSR denote the relations between a main body sentence and its local context, its document title, and a given query, respectively.
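To make the framework concrete, the following is a minimal sketch (in Python, not the authors' code) of a sentence regression pipeline: a scoring function ranks sentences by estimated salience, and a greedy selector assembles a summary under a word budget while skipping redundant sentences. Here, score_sentence, the budget, and the redundancy threshold are illustrative assumptions; in a real system the scores would come from a trained regressor such as SRSum.

```python
def score_sentence(sentence: str) -> float:
    # Hypothetical stand-in for a learned salience regressor (e.g., a
    # trained SRSum model); sentence length serves as a dummy signal.
    return float(len(sentence.split()))

def redundancy(a: str, b: str) -> float:
    # Word-level Jaccard similarity as a simple redundancy measure.
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(1, len(wa | wb))

def greedy_summary(sentences, budget=100, max_redundancy=0.5):
    # Rank sentences by score, then greedily add them to the summary
    # while respecting the word budget and avoiding near-duplicates.
    summary, used = [], 0
    for sent in sorted(sentences, key=score_sentence, reverse=True):
        n = len(sent.split())
        if used + n > budget:
            continue
        if any(redundancy(sent, s) > max_redundancy for s in summary):
            continue
        summary.append(sent)
        used += n
    return summary
```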
We propose a deep neural network model, Sentence Relation-based Summarization (SRSum), that consists of five sub-models: PriorSum, CSRSum, TSRSum, QSRSum, and SFSum. PriorSum encodes the latent semantic meaning of a sentence using a bi-gram convolutional neural network. SFSum encodes the surface information of a sentence, e.g., its length and position. CSRSum, TSRSum, and QSRSum are sentence relation sub-models corresponding to CSR, TSR, and QSR, respectively. CSRSum evaluates how well each sentence summarizes its local context. Specifically, it applies a CSR-based word-level and sentence-level attention mechanism to simulate the context-aware reading of a human reader, in which words and sentences that have anaphoric relations or local summarization ability attract more attention and are more easily remembered. TSRSum evaluates the semantic closeness of each sentence to the document title, which usually reflects the main ideas of the document; it applies a TSR-based attention mechanism to simulate reading with the main idea (the title) in mind. QSRSum evaluates the relevance of each sentence to the given queries for query-focused summarization; it applies a QSR-based attention mechanism to simulate the attentive reading of a human reader with a query in mind, recognizing which parts of the query are most likely to be answered by the sentence under consideration. As a whole, SRSum automatically learns useful latent features by jointly learning representations of query sentences, content sentences, and title sentences, as well as their relations.
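As a rough illustration of two core building blocks, the sketch below (PyTorch) shows a PriorSum-style bi-gram convolutional sentence encoder and a generic relation-based attention pooling step in the spirit of CSRSum, TSRSum, and QSRSum. The class and function names, dimensions, and activation choices are our illustrative assumptions, not the paper's hyperparameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BigramCNNEncoder(nn.Module):
    """Sketch of a PriorSum-style encoder: a convolution over word
    embedding bi-grams followed by max-over-time pooling."""
    def __init__(self, vocab_size: int, emb_dim: int = 50, hidden_dim: int = 50):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # kernel_size=2 slides over bi-gram windows of the word sequence
        # (sentences must therefore contain at least two tokens).
        self.conv = nn.Conv1d(emb_dim, hidden_dim, kernel_size=2)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        x = self.embed(token_ids).transpose(1, 2)  # (batch, emb, seq)
        h = torch.tanh(self.conv(x))               # (batch, hid, seq - 1)
        return h.max(dim=2).values                 # (batch, hid)

def relation_attention(sentence_vec: torch.Tensor,
                       related_vecs: torch.Tensor) -> torch.Tensor:
    """Sketch of relation-based attention pooling: weight the related
    sentences (context, title, or query) by their similarity to the
    sentence under consideration and return the attended representation."""
    scores = related_vecs @ sentence_vec   # (num_related,)
    weights = F.softmax(scores, dim=0)
    return weights @ related_vecs          # (hidden_dim,)
```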
We conduct extensive experiments on six benchmark datasets covering both generic multi-document summarization and query-focused multi-document summarization. On both tasks, SRSum achieves performance that is comparable or superior to state-of-the-art approaches in terms of multiple ROUGE metrics.
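For reference, ROUGE scores of the kind reported in these experiments can be computed with the open-source rouge-score package (pip install rouge-score), a reimplementation of the original ROUGE toolkit and not necessarily the evaluation setup used by the authors; the reference and candidate summaries below are placeholders.

```python
from rouge_score import rouge_scorer

# Compare a candidate summary against a reference summary using
# ROUGE-1, ROUGE-2, and ROUGE-L.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                  use_stemmer=True)
reference = "the model exploits sentence relations for extractive summarization"
candidate = "the model uses sentence relations to extract summary sentences"
scores = scorer.score(reference, candidate)
for name, score in scores.items():
    print(f"{name}: recall={score.recall:.3f} f1={score.fmeasure:.3f}")
```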




Published In

ACM Transactions on Information Systems, Volume 36, Issue 4 (October 2018), 365 pages
ISSN: 1046-8188
EISSN: 1558-2868
DOI: 10.1145/3211967

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 April 2018
Accepted: 01 March 2018
Revised: 01 March 2018
Received: 01 October 2017
Published in TOIS Volume 36, Issue 4


Author Tags

  1. Extractive summarization
  2. attentive pooling
  3. neural network
  4. sentence relations

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Natural Science Foundation of China
  • Natural Science Foundation of Shandong province
  • Microsoft Research Ph.D. program
  • Google Faculty Research Awards program
  • Netherlands Organisation for Scientific Research (NWO)
  • European Community's Seventh Framework Programme
  • Netherlands Institute for Sound and Vision
  • Ahold Delhaize
  • Amsterdam Data Science
  • Bloomberg Research Grant program
  • Elsevier
  • Fundamental Research Funds of Shandong University


