Dialogue Topic Extraction as Sentence Sequence Labeling

Pan, Dinghao; Yang, Zhihao; Tan, Haixin; Wu, Jiangming; Lin, Hongfei

doi:10.1007/978-3-031-17189-5_21

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13552)

Included in the following conference series:

CCF International Conference on Natural Language Processing and Chinese Computing

Abstract

Topic information in dialogue text helps a model understand the intentions of the dialogue participants and produce abstractive summaries of the dialogue content. The dialogue topic extraction task aims to extract the evolving topic information in long dialogue texts. In this work, we focus on topic extraction from dialogue texts in customer service scenarios. Because the topic tags carry rich sequence features, we define the task as sequence labeling with sentences as the basic elements. For this task, we build a dialogue topic extraction system using a Chinese pre-trained language model and a conditional random field (CRF) model. In addition, we use sliding windows to avoid excessive loss of contextual information, and we use adversarial training and model integration to improve the performance and robustness of our model. Our system ranked first on Track 1 of the NLPCC-2022 shared task on Dialogue Text Analysis, Topic Extraction and Dialogue Summary.
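The sliding-window idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the window size, the stride, or how predictions from overlapping windows are combined, so the function names, the parameters `window` and `stride`, and the majority-vote merge below are all illustrative assumptions rather than the authors' implementation.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def sliding_windows(n_sentences: int, window: int, stride: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) sentence spans that cover the dialogue.

    Hypothetical helper: `window` and `stride` are assumed hyperparameters.
    Requiring stride <= window guarantees every sentence is covered.
    """
    if window <= 0 or stride <= 0 or stride > window:
        raise ValueError("need 0 < stride <= window")
    if n_sentences <= 0:
        return []
    spans = []
    start = 0
    while True:
        end = min(start + window, n_sentences)
        spans.append((start, end))
        if end >= n_sentences:
            break
        start += stride
    return spans


def merge_window_labels(
    n_sentences: int,
    spans: Sequence[Tuple[int, int]],
    window_labels: Sequence[Sequence[str]],
) -> List[str]:
    """Combine per-window sentence labels by majority vote (an assumed strategy).

    Ties go to the label that was seen first for that sentence.
    """
    votes = [Counter() for _ in range(n_sentences)]
    for (start, end), labels in zip(spans, window_labels):
        for offset, label in enumerate(labels[: end - start]):
            votes[start + offset][label] += 1
    return [v.most_common(1)[0][0] for v in votes]


# Example: a 5-sentence dialogue, windows of 4 sentences, stride 3.
spans = sliding_windows(5, window=4, stride=3)
# Stand-in per-window predictions; a real system would get these from the tagger.
window_labels = [["A", "A", "B", "B"], ["B", "C"]]
merged = merge_window_labels(5, spans, window_labels)
```

Overlapping windows let a sentence near a window boundary be labeled with context from both sides, which is one plausible reading of "avoiding excessive loss of contextual information" for dialogues longer than the encoder's input limit.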

Acknowledgements

This work is supported by grants from the Fundamental Research Funds for the Central Universities (No. DUT22ZD205).

Author information

Authors and Affiliations

School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, Liaoning, China
Dinghao Pan, Zhihao Yang, Haixin Tan, Jiangming Wu & Hongfei Lin

Corresponding author

Correspondence to Zhihao Yang.

Editor information

Editors and Affiliations

Singapore University of Technology and Design, Singapore, Singapore
Wei Lu
Nanjing University, Nanjing, China
Shujian Huang
Soochow University, Suzhou, China
Yu Hong
Soochow University, Soochow, China
Xiabing Zhou

About this paper

Cite this paper

Pan, D., Yang, Z., Tan, H., Wu, J., Lin, H. (2022). Dialogue Topic Extraction as Sentence Sequence Labeling. In: Lu, W., Huang, S., Hong, Y., Zhou, X. (eds) Natural Language Processing and Chinese Computing. NLPCC 2022. Lecture Notes in Computer Science, vol 13552. Springer, Cham. https://doi.org/10.1007/978-3-031-17189-5_21

DOI: https://doi.org/10.1007/978-3-031-17189-5_21
Published: 24 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-17188-8
Online ISBN: 978-3-031-17189-5
eBook Packages: Computer Science, Computer Science (R0)
