
Augmented Topic-Specific Summarization for Domain Dialogue Text

Conference paper in: Natural Language Processing and Chinese Computing (NLPCC 2022)

Abstract

This paper describes HW-TSC’s submission to the NLPCC 2022 dialogue text summarization task. We decompose the task into two sub-tasks: sub-summary generation and topic detection. A sequence-to-sequence Transformer is adopted as the backbone of our generation model, and an ensemble topic detection model is used to filter out uninformative summaries. In addition, we apply multiple data processing and data augmentation methods to improve the effectiveness of the system: a constrained search method constructs the generation model’s training pairs between sub-dialogues and sub-summaries, and multiple role-centric training data augmentation strategies enhance both the generation model and the topic detection model. Our experiments demonstrate the effectiveness of these methods. Our system ranks first in the test evaluation with the highest ROUGE score of 51.764.
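The abstract outlines a generate-then-filter pipeline: a sequence-to-sequence Transformer produces one sub-summary per sub-dialogue, and an ensemble of topic detection models discards uninformative candidates. The following is a minimal sketch of that composition only, not the authors' implementation; the function names, the averaged ensemble vote, and the 0.5 threshold are all assumptions made for illustration.

    # Sketch of the generate-then-filter pipeline described in the abstract.
    # All names below are hypothetical placeholders, not the authors' code.
    from typing import Callable, List

    def summarize_dialogue(
        sub_dialogues: List[str],
        generate_sub_summary: Callable[[str], str],      # seq2seq Transformer generator (assumed interface)
        topic_scorers: List[Callable[[str], float]],     # ensemble of topic detectors, each returning an informativeness score
        threshold: float = 0.5,                          # assumed cut-off for keeping a sub-summary
    ) -> List[str]:
        """Generate one sub-summary per sub-dialogue, then keep only those
        the topic-detection ensemble judges informative."""
        kept: List[str] = []
        for sub_dialogue in sub_dialogues:
            candidate = generate_sub_summary(sub_dialogue)
            # Average the ensemble's scores for this candidate and filter.
            score = sum(scorer(candidate) for scorer in topic_scorers) / len(topic_scorers)
            if score >= threshold:
                kept.append(candidate)
        return kept

In this reading, the constrained search and role-centric augmentation mentioned in the abstract affect how the generator and topic detectors are trained, not the inference-time composition sketched above.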



Author information

Corresponding author: Hao Yang


Copyright information

© 2022 Springer Nature Switzerland AG

About this paper


Cite this paper

Rao, Z., et al. (2022). Augmented Topic-Specific Summarization for Domain Dialogue Text. In: Lu, W., Huang, S., Hong, Y., Zhou, X. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2022. Lecture Notes in Computer Science, vol. 13552. Springer, Cham. https://doi.org/10.1007/978-3-031-17189-5_23


  • DOI: https://doi.org/10.1007/978-3-031-17189-5_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-17188-8

  • Online ISBN: 978-3-031-17189-5

  • eBook Packages: Computer Science, Computer Science (R0)
