Abstract
This paper describes HW-TSC's submission to the NLPCC 2022 dialogue text summarization task. We decompose the task into two sub-tasks: sub-summary generation and topic detection. A sequence-to-sequence Transformer is adopted as the backbone of our generation model, and an ensemble topic detection model is used to filter out uninformative summaries. In addition, we apply several data processing and data augmentation methods to improve the effectiveness of the system. A constrained search method is used to construct the generation model's training pairs by aligning sub-dialogues with their corresponding sub-summaries, and multiple role-centric data augmentation strategies are used to enhance both the generation model and the topic detection model. Our experiments demonstrate the effectiveness of these methods. Our submission ranks first in the test evaluation with the highest ROUGE score of 51.764.
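The abstract describes the constrained search only at a high level. As an illustration, the sketch below shows one plausible form such an alignment could take: a greedy search that pairs each reference sub-summary with a contiguous span of dialogue turns, constrained so that the chosen spans are ordered and non-overlapping. The function names (`align_sub_summaries`, `unigram_f1`), the overlap measure, and the ordering constraint are assumptions for illustration, not the authors' exact method.

```python
import re

def _tokens(text: str) -> list[str]:
    """Lowercased word tokens; punctuation is stripped."""
    return re.findall(r"\w+", text.lower())

def unigram_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 over token sets, a cheap stand-in for ROUGE-1."""
    cand, ref = set(_tokens(candidate)), set(_tokens(reference))
    overlap = len(cand & ref)
    if not cand or not ref or overlap == 0:
        return 0.0
    p, r = overlap / len(cand), overlap / len(ref)
    return 2 * p * r / (p + r)

def align_sub_summaries(turns: list[str], sub_summaries: list[str],
                        max_span: int = 8) -> list[tuple[int, int]]:
    """Greedily assign each sub-summary to the contiguous span of dialogue
    turns that maximizes lexical overlap. The search is constrained so that
    spans are in order and non-overlapping, mirroring the assumption that
    sub-summaries follow the flow of the dialogue."""
    pairs, start = [], 0
    for summary in sub_summaries:
        best_score, best_span = -1.0, (start, min(start + 1, len(turns)))
        for i in range(start, len(turns)):
            for j in range(i + 1, min(i + max_span, len(turns)) + 1):
                score = unigram_f1(" ".join(turns[i:j]), summary)
                if score > best_score:
                    best_score, best_span = score, (i, j)
        pairs.append(best_span)
        start = best_span[1]  # constraint: next span begins after this one
    return pairs

# Toy usage: two sub-summaries aligned to spans of a four-turn dialogue.
turns = [
    "A: my package has not arrived",
    "B: let me check the tracking number",
    "A: also I want to change my address",
    "B: I have updated it for you",
]
subs = ["The customer reports a missing package.",
        "The agent updates the customer's address."]
print(align_sub_summaries(turns, subs))  # -> [(0, 1), (1, 3)]
```

The ordering constraint does the real work here: once a sub-summary is anchored, later sub-summaries only search the remaining suffix of the dialogue, which both prunes the search space and keeps the resulting sub-dialogue/sub-summary training pairs consistent with the conversation's flow.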
Cite this paper
Rao, Z., et al. (2022). Augmented Topic-Specific Summarization for Domain Dialogue Text. In: Lu, W., Huang, S., Hong, Y., Zhou, X. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2022. Lecture Notes in Computer Science, vol. 13552. Springer, Cham. https://doi.org/10.1007/978-3-031-17189-5_23