DOI: 10.1145/3077136.3080792
Research article

Leveraging Contextual Sentence Relations for Extractive Summarization Using a Neural Attention Model

Published: 07 August 2017

Abstract

As a framework for extractive summarization, sentence regression has achieved state-of-the-art performance in several widely-used practical systems. The most challenging task within the sentence regression framework is to identify discriminative features to encode a sentence into a feature vector. So far, sentence regression approaches have neglected to use features that capture contextual relations among sentences.
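To make the framework concrete, here is a minimal sketch of sentence regression: each sentence is encoded as a feature vector and a regressor maps features to a salience score, with the top-scoring sentences extracted. The surface features and toy salience targets below are illustrative stand-ins, not the feature set used by any particular system.

```python
import numpy as np

# Toy document: sentences plus hypothetical salience targets
# (in practice these would be ROUGE scores against reference summaries).
sentences = [
    "The storm hit the coast on Monday.",
    "It was the strongest storm in a decade.",
    "Officials urged residents to evacuate.",
    "Local shops reported brisk umbrella sales.",
]

def surface_features(sent, position, n_sents):
    words = sent.lower().split()
    return np.array([
        1.0 - position / n_sents,  # earlier sentences tend to matter more
        len(words) / 20.0,         # normalized sentence length
        float(position == 0),      # lead-sentence indicator
    ])

X = np.stack([surface_features(s, i, len(sentences))
              for i, s in enumerate(sentences)])
y = np.array([0.9, 0.6, 0.7, 0.1])  # toy salience targets

# Least-squares regression: learn weights mapping features to salience.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Score every sentence and extract the top two as the summary.
scores = X @ w
summary = [sentences[i] for i in np.argsort(-scores)[:2]]
```

Real systems use richer features and a length-constrained selection step; the point here is only the encode-then-regress pipeline that the paper builds on.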
We propose a neural network model, Contextual Relation-based Summarization (CRSum), to take advantage of contextual relations among sentences so as to improve the performance of sentence regression. Specifically, we first use sentence relations with a word-level attentive pooling convolutional neural network to construct sentence representations. Then, we use contextual relations with a sentence-level attentive pooling recurrent neural network to construct context representations. Finally, CRSum automatically learns useful contextual features by jointly learning representations of sentences and similarity scores between a sentence and sentences in its context. Using a two-level attention mechanism, CRSum is able to pay attention to important content, i.e., words and sentences, in the surrounding context of a given sentence.
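The two-level attention idea can be sketched numerically: pool word vectors into a sentence vector under one attention, then pool surrounding sentence vectors into a context vector using the current sentence as the query, so related context gets more weight. The random embeddings, dimensions, and the single dot-product query are simplifying assumptions, not the paper's trained CNN/RNN encoders.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attentive_pool(vectors, query):
    """Weight each vector by its scaled dot product with a query,
    then return the weighted average: a minimal attentive pooling."""
    scores = vectors @ query / np.sqrt(len(query))
    weights = softmax(scores)
    return weights @ vectors, weights

d = 8
# Word level: pool the word embeddings of one sentence (random
# stand-ins here) into a sentence representation.
word_embs = rng.normal(size=(5, d))   # 5 words, d-dim embeddings
sent_query = rng.normal(size=d)       # a learned query in the full model
sent_vec, word_attn = attentive_pool(word_embs, sent_query)

# Sentence level: pool the representations of surrounding sentences
# into a context representation, querying with the current sentence so
# that more similar context sentences receive higher attention weight.
context_vecs = rng.normal(size=(4, d))  # 4 context sentences
ctx_vec, sent_attn = attentive_pool(context_vecs, sent_vec)
```

In CRSum itself the sentence encoder is a convolutional network and the context encoder is recurrent, with all parameters learned jointly; this sketch only shows how the two attentive-pooling stages compose.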
We carry out extensive experiments on six benchmark datasets. CRSum alone can achieve comparable performance with state-of-the-art approaches; when combined with a few basic surface features, it significantly outperforms the state-of-the-art in terms of multiple ROUGE metrics.
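For readers unfamiliar with the evaluation metric, ROUGE-N is recall-oriented n-gram overlap between a candidate summary and a human reference. A minimal sketch (single reference, no stemming or stopword handling, unlike the full ROUGE toolkit):

```python
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n=1):
    """Recall-oriented ROUGE-N: clipped n-gram overlap divided by the
    number of n-grams in the reference summary."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0

score = rouge_n_recall("the storm hit the coast",
                       "a storm hit the coast on monday", n=1)
# Four of the reference's seven unigrams are matched, so score = 4/7.
```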


Cited By

  • (2024) Cervical Proprioception and Vestibular Functions in Patients with Neck Pain and Cervicogenic Headache: A Comparative Study. Journal of Turkish Spinal Surgery, 113-118. DOI: 10.4274/jtss.galenos.2024.75047
  • (2024) Strong robust copy-move forgery detection network based on layer-by-layer decoupling refinement. Information Processing and Management 61(3). DOI: 10.1016/j.ipm.2024.103685
  • (2024) Analyzing the Impact of Extractive Summarization Techniques on Legal Text. Proceedings of Data Analytics and Management, 585-602. DOI: 10.1007/978-981-99-6544-1_44
  • (2023) Sentiment Analysis using a CNN-BiLSTM Deep Model Based on Attention Classification. Information 26(3), 117-162. DOI: 10.47880/inf2603-02
  • (2023) GAE-ISUMM: Unsupervised Graph-based Summarization for Indian Languages. IJCNN 2023, 1-8. DOI: 10.1109/IJCNN54540.2023.10191588
  • (2023) Can Anaphora Resolution Improve Extractive Query-Focused Multi-Document Summarization? IEEE Access 11, 99961-99976. DOI: 10.1109/ACCESS.2023.3314524
  • (2023) Learning to summarize multi-documents with local and global information. Progress in Artificial Intelligence 12(3), 275-286. DOI: 10.1007/s13748-023-00302-z
  • (2023) HierMDS: a hierarchical multi-document summarization model with global-local document dependencies. Neural Computing and Applications 35(25), 18553-18570. DOI: 10.1007/s00521-023-08680-0
  • (2023) Query focused summarization via relevance distillation. Neural Computing and Applications 35(22), 16543-16557. DOI: 10.1007/s00521-023-08525-w
  • (2023) A Comparative Study of Sentence Embeddings for Unsupervised Extractive Multi-document Summarization. Artificial Intelligence and Machine Learning, 78-95. DOI: 10.1007/978-3-031-39144-6_6


Published In

SIGIR '17: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
August 2017, 1476 pages
ISBN: 9781450350228
DOI: 10.1145/3077136

Publisher

Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. contextual sentence relation
    2. extractive summarization
    3. neural network

    Acceptance Rates

SIGIR '17 paper acceptance rate: 78 of 362 submissions (22%)
Overall acceptance rate: 792 of 3,983 submissions (20%)
