DOI: 10.1145/3459637.3482396

Summarizing Long-Form Document with Rich Discourse Information

Published: 30 October 2021

ABSTRACT

The development of extractive summarization models for long-form documents has been hindered by two factors: 1) the computational cost of the summarization model increases dramatically with the sheer length of the input document; 2) the discourse structure of long-form documents has not been fully exploited. To address these two deficiencies, we propose HEROES, a novel extractive summarization model for summarizing long-form documents with rich discourse structural information. The HEROES model consists of two modules: 1) a content ranking module that ranks and selects salient sections and sentences to compose a short digest, which makes complex summarization models affordable and serves as their input; 2) an extractive summarization module based on a heterogeneous graph whose nodes come from different discourse levels and whose elaborately designed edge connections reflect the discourse hierarchy of the document and restrain semantic drift across section boundaries. Experimental results on benchmark datasets show that HEROES achieves significantly better performance than various strong baselines.
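To make the two-module design concrete, the sketch below walks through a deliberately simplified version of the pipeline: rank content to form a short digest, then score sentences by message passing over a small heterogeneous graph of sentence and section nodes. This is not the authors' HEROES implementation; the function names (rank_content, heterogeneous_graph_scores), the length-based salience proxy, and the mean-pooling update are hypothetical stand-ins for the learned neural components described in the paper.

    import numpy as np

    def rank_content(sections, budget=6):
        # Stage 1 (content ranking, simplified): keep the highest-scoring
        # sentences so that the graph module only sees a short digest.
        scored = []
        for sec_id, sentences in enumerate(sections):
            for sent in sentences:
                # Toy salience proxy (sentence length); HEROES learns this score.
                scored.append((len(sent.split()), sec_id, sent))
        scored.sort(key=lambda t: -t[0])
        return [(sec_id, sent) for _, sec_id, sent in scored[:budget]]

    def heterogeneous_graph_scores(digest, dim=16, iters=2, seed=0):
        # Stage 2 (simplified): sentence and section nodes exchange messages.
        # Sentence-section edges follow the discourse hierarchy, which keeps
        # information from drifting freely across section boundaries.
        rng = np.random.default_rng(seed)
        sec_ids = sorted({sec for sec, _ in digest})
        sent_h = rng.normal(size=(len(digest), dim))   # sentence node states
        adj = np.zeros((len(digest), len(sec_ids)))    # sentence-to-section edges
        for i, (sec, _) in enumerate(digest):
            adj[i, sec_ids.index(sec)] = 1.0
        for _ in range(iters):
            # Section nodes aggregate their own sentences, then broadcast back.
            sec_h = adj.T @ sent_h / np.maximum(adj.sum(axis=0)[:, None], 1.0)
            sent_h = 0.5 * sent_h + 0.5 * (adj @ sec_h)
        doc = sent_h.mean(axis=0)                      # document-level vector
        return sent_h @ doc                            # salience score per sentence

    sections = [["Graph models capture document structure well.",
                 "Long inputs make attention expensive."],
                ["We select salient sentences before graph encoding."]]
    digest = rank_content(sections)
    scores = heterogeneous_graph_scores(digest)
    ranked = sorted(zip(digest, scores), key=lambda x: -x[1])
    print([sent for (_, sent), _ in ranked[:2]])

Restricting each sentence's graph edges to its own section node is the sketch's stand-in for the paper's discourse-hierarchy constraint; in the actual model both stages are learned neural modules, and the sketch only mirrors the data flow between them.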

Published in

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
October 2021, 4966 pages
ISBN: 9781450384469
DOI: 10.1145/3459637
Copyright © 2021 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Qualifiers: research-article

Acceptance Rates

Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
