DOI: 10.1145/3639233.3639348

Leveraging Salience Analysis and Sparse Attention for Long Document Summarization

Published: 05 March 2024

ABSTRACT

Extractive and abstractive summarization models have achieved promising results on relatively short documents, but they still struggle with longer-form documents such as scientific papers. Specifically, extractive models produce inaccurate or redundant summaries because of their weak salience analysis, while transformer-based abstractive models suffer from the quadratic dependence of their full attention mechanism on sequence length. To remedy this, we propose a novel hybrid model named LDSumm (Long Document Summarization), composed of an extractive module that strengthens salience analysis by leveraging the hierarchical structure of a document (especially its section information), and an abstractive module that applies sparse-attention ideas to enlarge the input size of BART. We conduct extensive experiments on two scientific-paper datasets, arXiv and PubMed. The results show that LDSumm outperforms the BART baseline and other comparison models, and obtains a larger gain on arXiv, the dataset with longer papers.
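
To make the two ideas in the abstract concrete, the minimal sketch below is an illustration only, not the authors' released code: the function names, the window size, and the section weights are assumptions made for exposition. It shows a local-window sparse-attention mask whose number of allowed token pairs grows linearly with sequence length rather than quadratically, and a toy section-aware re-weighting of sentence salience scores.

# Illustrative sketch only (not the paper's implementation): a windowed
# sparse-attention mask and a toy section-aware salience re-weighting.
import numpy as np

def windowed_attention_mask(seq_len, window, global_tokens=(0,)):
    """Boolean mask where True means 'may attend'. Each token sees only its
    local window plus a few global tokens, so the number of allowed pairs
    grows as O(seq_len * window) instead of O(seq_len ** 2)."""
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for i in range(seq_len):
        lo, hi = max(0, i - window), min(seq_len, i + window + 1)
        mask[i, lo:hi] = True
    for g in global_tokens:  # e.g. a [CLS]-style token attends, and is attended to, everywhere
        mask[g, :] = True
        mask[:, g] = True
    return mask

def section_aware_salience(sentence_scores, section_ids, section_weights):
    """Toy salience analysis: re-weight each sentence's base score by the
    (assumed) importance of the section it belongs to."""
    return [score * section_weights.get(section, 1.0)
            for score, section in zip(sentence_scores, section_ids)]

if __name__ == "__main__":
    m = windowed_attention_mask(seq_len=16, window=2)
    print("allowed attention pairs:", int(m.sum()), "of", 16 * 16)
    scores = section_aware_salience(
        [0.4, 0.7, 0.3],
        ["introduction", "method", "conclusion"],
        {"introduction": 1.2, "conclusion": 1.5},
    )
    print("re-weighted salience:", [round(s, 2) for s in scores])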


Published in
NLPIR '23: Proceedings of the 2023 7th International Conference on Natural Language Processing and Information Retrieval
December 2023, 336 pages
ISBN: 9798400709227
DOI: 10.1145/3639233

Copyright © 2023 ACM


Publisher: Association for Computing Machinery, New York, NY, United States
