Abstract
Mainstream abstractive summarization methods currently use machine learning models based on the encoder-decoder architecture, typically with an encoder built on a recurrent neural network. Such a model mainly learns the sequential information of the text and rarely captures its structural information. From a linguistic perspective, text structure is an effective signal for judging the importance of content. To give the model access to this structural information, this paper proposes using discourse relations in the text summarization task, which directs the model's attention to the important parts of the text. On top of a traditional LSTM encoder, this paper adds graph convolutional networks to obtain the structural information of the text. In addition, this paper proposes a fusion layer that lets the model attend to the sequential information of the text while acquiring its structural information. Experimental results show that system performance on the ROUGE evaluation improves significantly after incorporating discourse relation information.
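The encoder described above can be sketched as follows. This is a minimal, hypothetical illustration, not the authors' implementation: `gcn_layer` applies one standard graph convolution (Kipf and Welling) over a sentence-level adjacency matrix derived from discourse relations, and `fuse` is an assumed gated fusion layer that mixes the sequential (LSTM-style) states with the graph-convolved states. All names, dimensions, and the gating formula are illustrative assumptions.

```python
import numpy as np

def gcn_layer(A, H, W):
    # One GCN layer: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W),
    # where A is the discourse-relation adjacency matrix over sentences.
    A_hat = A + np.eye(A.shape[0])              # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))      # symmetric normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

def fuse(H_seq, H_graph, Wg):
    # Hypothetical fusion layer: a sigmoid gate decides, per dimension,
    # how much sequential vs. structural information to keep.
    g = 1.0 / (1.0 + np.exp(-np.concatenate([H_seq, H_graph], axis=-1) @ Wg))
    return g * H_seq + (1.0 - g) * H_graph

rng = np.random.default_rng(0)
n, d = 4, 8                                     # 4 sentence nodes, hidden size 8
A = np.array([[0, 1, 0, 0],                     # toy discourse links between sentences
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
H_seq = rng.standard_normal((n, d))             # stand-in for LSTM encoder outputs
H_graph = gcn_layer(A, H_seq, rng.standard_normal((d, d)))
H_fused = fuse(H_seq, H_graph, rng.standard_normal((2 * d, d)))
print(H_fused.shape)                            # (4, 8)
```

In this sketch the fused states `H_fused` would feed the attention-based decoder in place of the plain LSTM outputs, so each sentence representation carries both its sequential context and its discourse-structural neighborhood.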
Acknowledgments
This research is supported by the National Natural Science Foundation of China (Grant Nos. 61976146, 61806137, 61702149, 61836007 and 61702518) and the Jiangsu High School Research Grant (No. 18KJB520043).
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Wei, W., Wang, H., Wang, Z. (2020). Abstractive Summarization via Discourse Relation and Graph Convolutional Networks. In: Zhu, X., Zhang, M., Hong, Y., He, R. (eds) Natural Language Processing and Chinese Computing. NLPCC 2020. Lecture Notes in Computer Science(), vol 12431. Springer, Cham. https://doi.org/10.1007/978-3-030-60457-8_27