DOI: 10.1145/3077136.3080792
Research article

Leveraging Contextual Sentence Relations for Extractive Summarization Using a Neural Attention Model

Published: 07 August 2017

Abstract

As a framework for extractive summarization, sentence regression has achieved state-of-the-art performance in several widely-used practical systems. The most challenging task within the sentence regression framework is to identify discriminative features to encode a sentence into a feature vector. So far, sentence regression approaches have neglected to use features that capture contextual relations among sentences.
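To make the framework concrete, here is a minimal sketch of sentence regression: each sentence is encoded as a feature vector and a regressor maps features to a salience score, with the top-scoring sentences extracted. The surface features and toy salience targets below are illustrative stand-ins, not the feature set used by any particular system.

```python
import numpy as np

# Toy document: sentences plus hypothetical salience targets
# (in practice these would be ROUGE scores against reference summaries).
sentences = [
    "The storm hit the coast on Monday.",
    "It was the strongest storm in a decade.",
    "Officials urged residents to evacuate.",
    "Local shops reported brisk umbrella sales.",
]

def surface_features(sent, position, n_sents):
    words = sent.lower().split()
    return np.array([
        1.0 - position / n_sents,  # earlier sentences tend to matter more
        len(words) / 20.0,         # normalized sentence length
        float(position == 0),      # lead-sentence indicator
    ])

X = np.stack([surface_features(s, i, len(sentences))
              for i, s in enumerate(sentences)])
y = np.array([0.9, 0.6, 0.7, 0.1])  # toy salience targets

# Least-squares regression: learn weights mapping features to salience.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Score every sentence and extract the top two as the summary.
scores = X @ w
summary = [sentences[i] for i in np.argsort(-scores)[:2]]
```

Real systems use richer features and a length-constrained selection step; the point here is only the encode-then-regress pipeline that the paper builds on.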
We propose a neural network model, Contextual Relation-based Summarization (CRSum), to take advantage of contextual relations among sentences so as to improve the performance of sentence regression. Specifically, we first use sentence relations with a word-level attentive pooling convolutional neural network to construct sentence representations. Then, we use contextual relations with a sentence-level attentive pooling recurrent neural network to construct context representations. Finally, CRSum automatically learns useful contextual features by jointly learning representations of sentences and similarity scores between a sentence and sentences in its context. Using a two-level attention mechanism, CRSum is able to pay attention to important content, i.e., words and sentences, in the surrounding context of a given sentence.
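The two-level attention idea can be sketched numerically: pool word vectors into a sentence vector under one attention, then pool surrounding sentence vectors into a context vector using the current sentence as the query, so related context gets more weight. The random embeddings, dimensions, and the single dot-product query are simplifying assumptions, not the paper's trained CNN/RNN encoders.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attentive_pool(vectors, query):
    """Weight each vector by its scaled dot product with a query,
    then return the weighted average: a minimal attentive pooling."""
    scores = vectors @ query / np.sqrt(len(query))
    weights = softmax(scores)
    return weights @ vectors, weights

d = 8
# Word level: pool the word embeddings of one sentence (random
# stand-ins here) into a sentence representation.
word_embs = rng.normal(size=(5, d))   # 5 words, d-dim embeddings
sent_query = rng.normal(size=d)       # a learned query in the full model
sent_vec, word_attn = attentive_pool(word_embs, sent_query)

# Sentence level: pool the representations of surrounding sentences
# into a context representation, querying with the current sentence so
# that more similar context sentences receive higher attention weight.
context_vecs = rng.normal(size=(4, d))  # 4 context sentences
ctx_vec, sent_attn = attentive_pool(context_vecs, sent_vec)
```

In CRSum itself the sentence encoder is a convolutional network and the context encoder is recurrent, with all parameters learned jointly; this sketch only shows how the two attentive-pooling stages compose.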
We carry out extensive experiments on six benchmark datasets. CRSum alone can achieve comparable performance with state-of-the-art approaches; when combined with a few basic surface features, it significantly outperforms the state-of-the-art in terms of multiple ROUGE metrics.
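For readers unfamiliar with the evaluation metric, ROUGE-N is recall-oriented n-gram overlap between a candidate summary and a human reference. A minimal sketch (single reference, no stemming or stopword handling, unlike the full ROUGE toolkit):

```python
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n=1):
    """Recall-oriented ROUGE-N: clipped n-gram overlap divided by the
    number of n-grams in the reference summary."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0

score = rouge_n_recall("the storm hit the coast",
                       "a storm hit the coast on monday", n=1)
# Four of the reference's seven unigrams are matched, so score = 4/7.
```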


Cited By

  • (2024) Cervical Proprioception and Vestibular Functions in Patients with Neck Pain and Cervicogenic Headache: A Comparative Study. Journal of Turkish Spinal Surgery, 113-118. DOI: 10.4274/jtss.galenos.2024.75047
  • (2024) Strong robust copy-move forgery detection network based on layer-by-layer decoupling refinement. Information Processing and Management 61(3). DOI: 10.1016/j.ipm.2024.103685
  • (2024) Analyzing the Impact of Extractive Summarization Techniques on Legal Text. Proceedings of Data Analytics and Management, 585-602. DOI: 10.1007/978-981-99-6544-1_44
  • (2023) Sentiment Analysis using a CNN-BiLSTM Deep Model Based on Attention Classification. Information 26(3), 117-162. DOI: 10.47880/inf2603-02
  • (2023) GAE-ISUMM: Unsupervised Graph-based Summarization for Indian Languages. IJCNN 2023, 1-8. DOI: 10.1109/IJCNN54540.2023.10191588
  • (2023) Can Anaphora Resolution Improve Extractive Query-Focused Multi-Document Summarization? IEEE Access 11, 99961-99976. DOI: 10.1109/ACCESS.2023.3314524
  • (2023) Learning to summarize multi-documents with local and global information. Progress in Artificial Intelligence 12(3), 275-286. DOI: 10.1007/s13748-023-00302-z
  • (2023) HierMDS: a hierarchical multi-document summarization model with global-local document dependencies. Neural Computing and Applications 35(25), 18553-18570. DOI: 10.1007/s00521-023-08680-0
  • (2023) Query focused summarization via relevance distillation. Neural Computing and Applications 35(22), 16543-16557. DOI: 10.1007/s00521-023-08525-w
  • (2023) A Comparative Study of Sentence Embeddings for Unsupervised Extractive Multi-document Summarization. Artificial Intelligence and Machine Learning, 78-95. DOI: 10.1007/978-3-031-39144-6_6


Published In

SIGIR '17: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
August 2017, 1476 pages
ISBN: 9781450350228
DOI: 10.1145/3077136

Publisher

Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. contextual sentence relation
    2. extractive summarization
    3. neural network

    Acceptance Rates

SIGIR '17 paper acceptance rate: 78 of 362 submissions (22%)
Overall acceptance rate: 792 of 3,983 submissions (20%)
