Abstract
Recently, advanced deep learning techniques such as recurrent neural networks (GRU, LSTM, and Bi-LSTM) and encoder architectures (the attention-based Transformer and BERT) have achieved great success in multiple application domains, including text summarization. Recent state-of-the-art encoder-based text summarization models such as BertSum, PreSum, and DiscoBert have demonstrated significant improvements on extractive summarization tasks. However, these models still suffer from a common problem: language-specific dependencies that require the support of external NLP tools. Moreover, recent advanced text representation methods, such as BERT used as a sentence-level textual encoder, fail to fully capture the representation of a full-length document. To address these challenges, in this paper we propose SE4ExSum, a novel integrated semantic-aware neural approach with a graph convolutional network for extractive text summarization.
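As context for the graph convolutional network named in the title, the sketch below shows one GCN propagation layer in the form popularized by Kipf and Welling (2017): H' = ReLU(D^-1/2 (A + I) D^-1/2 H W). This is illustrative NumPy code, not the paper's implementation; the function name, toy sentence graph, and dimensions are assumptions.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])           # add self-loops
    d = A_hat.sum(axis=1)                    # node degrees of A_hat
    D_inv_sqrt = np.diag(d ** -0.5)          # D^-1/2
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)   # linear transform + ReLU

# Toy document graph: 3 sentence nodes with 4-dim features, 2-dim output
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H = np.random.rand(3, 4)   # sentence-level feature matrix
W = np.random.rand(4, 2)   # learnable weight matrix
out = gcn_layer(A, H, W)
```

In a summarization setting, nodes would carry sentence embeddings (e.g. from a BERT-style encoder) and edges would encode inter-sentence relations, so each propagation step mixes document-level context into every sentence representation.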
- T. Wang, P. Chen, and D. Simovici. 2016. A new evaluation measure using compression dissimilarity on text summarization. Applied Intelligence 45, 1 (2016), 127–134.
- O. K. Oyedotun and A. Khashman. 2016. Document segmentation using textural features summarization and feedforward neural network. Applied Intelligence 45, 1 (2016), 198–212.
- W. S. El-Kassas, C. R. Salama, A. A. Rafea, and H. K. Mohamed. 2020. Automatic text summarization: A comprehensive survey. Expert Systems with Applications (2020), 113679.
- A. See, P. J. Liu, and C. D. Manning. 2017. Get to the point: Summarization with pointer-generator networks. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017.
- M. Gambhir and V. Gupta. 2017. Recent automatic text summarization techniques: A survey. Artificial Intelligence Review 47, 1 (2017), 1–66.
- D. Bahdanau, K. Cho, and Y. Bengio. 2015. Neural machine translation by jointly learning to align and translate. In 3rd International Conference on Learning Representations (ICLR), 2015.
- C. Kedzie, K. McKeown, and H. Daumé III. 2018. Content selection in deep learning models of summarization. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 2018.
- J. Cheng and M. Lapata. 2016. Neural summarization by extracting sentences and words. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016.
- W. Xu, S. Li, and Y. Lu. 2020. USR-MTL: An unsupervised sentence representation learning framework with multi-task learning. Applied Intelligence (2020), 1–16.
- T. Wang, L. Liu, N. Liu, H. Zhang, L. Zhang, and S. Feng. 2020. A multi-label text classification method via dynamic semantic representation model and deep neural network. Applied Intelligence 50, 8 (2020), 2339–2351.
- M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer. 2018. Deep contextualized word representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018.
- A. Radford, K. Narasimhan, T. Salimans, and I. Sutskever. 2018. Improving language understanding by generative pre-training. OpenAI, 2018.
- J. Devlin, M. W. Chang, K. Lee, and K. Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019.
- Y. Liu and M. Lapata. 2019. Text summarization with pretrained encoders. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019.
- H. Zhang, J. Xu, and J. Wang. 2019. Pretraining-based natural language generation for text summarization. In Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL), 2019.
- W. Kryściński, R. Paulus, C. Xiong, and R. Socher. 2018. Improving abstraction in text summarization. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 2018.
- W. Xiao and G. Carenini. 2019. Extractive summarization of long documents by combining global and local context. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019.
- J. Xu, Z. Gan, Y. Cheng, and J. Liu. 2020. Discourse-aware neural extractive text summarization. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020.
- T. N. Kipf and M. Welling. 2017. Semi-supervised classification with graph convolutional networks. In 5th International Conference on Learning Representations (ICLR), 2017.
- R. Paulus, C. Xiong, and R. Socher. 2018. A deep reinforced model for abstractive summarization. In International Conference on Learning Representations, 2018.
- Q. Guo, J. Huang, N. Xiong, and P. Wang. 2019. MS-pointer network: Abstractive text summary based on multi-head self-attention. IEEE Access 7 (2019), 138603–138613.
- S. Gehrmann, Y. Deng, and A. M. Rush. 2018. Bottom-up abstractive summarization. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 2018.
- W. Li, X. Xiao, Y. Lyu, and Y. Wang. 2018. Improving neural abstractive document summarization with explicit information selection modeling. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 2018.
- R. Pasunuru and M. Bansal. 2018. Multi-reward reinforced summarization with saliency and entailment. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018.
- T. Scialom, S. Lamprier, B. Piwowarski, and J. Staiano. 2019. Answers unite! Unsupervised metrics for reinforced summarization models. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019.
- L. Liu, Y. Lu, M. Yang, Q. Qu, J. Zhu, and H. Li. 2018. Generative adversarial network for abstractive text summarization. In Proceedings of the AAAI Conference on Artificial Intelligence, 2018.
- T. Zhang, H. Ji, and A. Sil. 2019. Joint entity and event extraction with generative adversarial imitation learning. Data Intelligence 1, 2 (2019), 99–120.
- T. Sabbah, A. Selamat, M. H. Selamat, F. S. Al-Anzi, E. H. Viedma, O. Krejcar, and H. Fujita. 2017. Modified frequency-based term weighting schemes for text classification. Applied Soft Computing 58 (2017), 193–206.
- R. Nallapati, F. Zhai, and B. Zhou. 2017. SummaRuNNer: A recurrent neural network based sequence model for extractive summarization of documents. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017.
- Z. Zhang, Y. Wu, H. Zhao, Z. Li, S. Zhang, X. Zhou, and X. Zhou. 2020. Semantics-aware BERT for language understanding. In Proceedings of the AAAI Conference on Artificial Intelligence, 2020.
- C. D. Manning, M. Surdeanu, J. Bauer, J. R. Finkel, S. Bethard, and D. McClosky. 2014. The Stanford CoreNLP natural language processing toolkit. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2014.
- R. Catelli, V. Casola, G. De Pietro, H. Fujita, and M. Esposito. 2021. Combining contextualized word representation and sub-document level analysis through Bi-LSTM+CRF architecture for clinical de-identification. Knowledge-Based Systems 213 (2021), 106649.
- J. Howard and S. Ruder. 2018. Universal language model fine-tuning for text classification. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018.
- A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems, 2017.
- T. Mikolov, K. Chen, G. Corrado, and J. Dean. 2013. Efficient estimation of word representations in vector space. In 1st International Conference on Learning Representations (ICLR), 2013.
- J. Pennington, R. Socher, and C. D. Manning. 2014. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014.
- F. Rousseau, E. Kiagias, and M. Vazirgiannis. 2015. Text categorization as a graph classification problem. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, 2015.
- L. Yao, C. Mao, and Y. Luo. 2019. Graph convolutional networks for text classification. In Proceedings of the AAAI Conference on Artificial Intelligence, 2019.
- K. Lee, L. He, M. Lewis, and L. Zettlemoyer. 2017. End-to-end neural coreference resolution. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2017.
- G. Durrett, T. Berg-Kirkpatrick, and D. Klein. 2016. Learning-based single-document summarization with compression and anaphoricity constraints. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016.
- C. Y. Lin. 2004. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, 2004.
- W. Wang and B. Chang. 2016. Graph-based dependency parsing with bidirectional LSTM. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016.
Index Terms
- SE4ExSum: An Integrated Semantic-aware Neural Approach with Graph Convolutional Network for Extractive Text Summarization