DOI: 10.1145/3459637.3482396

Summarizing Long-Form Document with Rich Discourse Information

Published: 30 October 2021

ABSTRACT

The development of extractive summarization models for long-form documents has been hindered by two factors: 1) the computational cost of the summarization model increases dramatically with the sheer length of the input document; 2) the discourse structure of long-form documents has not been fully exploited. To address these two deficiencies, we propose HEROES, a novel extractive summarization model for summarizing long-form documents with rich discourse structural information. The HEROES model consists of two modules: 1) a content ranking module that ranks and selects salient sections and sentences to compose a short digest, which makes complex summarization models affordable and serves as their input; 2) an extractive summarization module based on a heterogeneous graph whose nodes come from different discourse levels and whose elaborately designed edge connections reflect the discourse hierarchy of the document and restrain semantic drift across section boundaries. Experimental results on benchmark datasets show that HEROES achieves significantly better performance than various strong baselines.
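To make the two-module design concrete, the sketch below walks through a deliberately simplified version of the pipeline: rank content to form a short digest, then score sentences by message passing over a small heterogeneous graph of sentence and section nodes. This is not the authors' HEROES implementation; the function names (rank_content, heterogeneous_graph_scores), the length-based salience proxy, and the mean-pooling update are hypothetical stand-ins for the learned neural components described in the paper.

    import numpy as np

    def rank_content(sections, budget=6):
        # Stage 1 (content ranking, simplified): keep the highest-scoring
        # sentences so that the graph module only sees a short digest.
        scored = []
        for sec_id, sentences in enumerate(sections):
            for sent in sentences:
                # Toy salience proxy (sentence length); HEROES learns this score.
                scored.append((len(sent.split()), sec_id, sent))
        scored.sort(key=lambda t: -t[0])
        return [(sec_id, sent) for _, sec_id, sent in scored[:budget]]

    def heterogeneous_graph_scores(digest, dim=16, iters=2, seed=0):
        # Stage 2 (simplified): sentence and section nodes exchange messages.
        # Sentence-section edges follow the discourse hierarchy, which keeps
        # information from drifting freely across section boundaries.
        rng = np.random.default_rng(seed)
        sec_ids = sorted({sec for sec, _ in digest})
        sent_h = rng.normal(size=(len(digest), dim))   # sentence node states
        adj = np.zeros((len(digest), len(sec_ids)))    # sentence-to-section edges
        for i, (sec, _) in enumerate(digest):
            adj[i, sec_ids.index(sec)] = 1.0
        for _ in range(iters):
            # Section nodes aggregate their own sentences, then broadcast back.
            sec_h = adj.T @ sent_h / np.maximum(adj.sum(axis=0)[:, None], 1.0)
            sent_h = 0.5 * sent_h + 0.5 * (adj @ sec_h)
        doc = sent_h.mean(axis=0)                      # document-level vector
        return sent_h @ doc                            # salience score per sentence

    sections = [["Graph models capture document structure well.",
                 "Long inputs make attention expensive."],
                ["We select salient sentences before graph encoding."]]
    digest = rank_content(sections)
    scores = heterogeneous_graph_scores(digest)
    ranked = sorted(zip(digest, scores), key=lambda x: -x[1])
    print([sent for (_, sent), _ in ranked[:2]])

Restricting each sentence's graph edges to its own section node is the sketch's stand-in for the paper's discourse-hierarchy constraint; in the actual model both stages are learned neural modules, and the sketch only mirrors the data flow between them.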

Published in

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
October 2021, 4966 pages
ISBN: 9781450384469
DOI: 10.1145/3459637
Copyright © 2021 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Qualifiers: research-article

Acceptance Rates

Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
