
Dialogue Topic Extraction as Sentence Sequence Labeling

  • Conference paper

Natural Language Processing and Chinese Computing (NLPCC 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13552)

Abstract

The topic information of dialogue text is important for a model to understand the intentions of the dialogue participants and to summarize the content of the dialogue abstractively. The dialogue topic extraction task aims to extract the evolving topic information from long dialogue texts. In this work, we focus on topic extraction from dialogue texts in customer service scenarios. Motivated by the rich sequential features in the topic tags, we formulate this task as a sequence labeling task whose basic elements are sentences. For this task, we build a dialogue topic extraction system that combines a Chinese pre-trained language model with a CRF model. In addition, we use sliding windows to avoid excessive loss of contextual information, and we apply adversarial training and model ensembling to improve the performance and robustness of our model. Our system ranks first on track 1 of the NLPCC-2022 shared task on Dialogue Text Analysis, Topic Extraction and Dialogue Summary.
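
To make the approach in the abstract concrete, the following is a minimal sketch of sentence-level sequence labeling with a pre-trained encoder and a CRF head, together with the sliding-window splitting. It is not the authors' released code: the [CLS] pooling, window size, and stride are assumptions made for the example, and it relies on the transformers and pytorch-crf packages.

    # Minimal sketch: each sentence of a dialogue is encoded by a
    # pre-trained LM, and a CRF labels the resulting sentence sequence.
    # Illustrative only; pooling choice and window sizes are assumptions.
    import torch.nn as nn
    from torchcrf import CRF            # pytorch-crf package
    from transformers import AutoModel

    class SentenceSequenceLabeler(nn.Module):
        def __init__(self, encoder_name: str, num_tags: int):
            super().__init__()
            self.encoder = AutoModel.from_pretrained(encoder_name)
            self.classifier = nn.Linear(self.encoder.config.hidden_size, num_tags)
            self.crf = CRF(num_tags, batch_first=True)

        def forward(self, input_ids, attention_mask, tags=None):
            # Encode each sentence independently; its [CLS] vector serves
            # as the sentence representation (one "token" of the sequence).
            out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
            sent_emb = out.last_hidden_state[:, 0]               # (n_sents, hidden)
            emissions = self.classifier(sent_emb).unsqueeze(0)   # (1, n_sents, n_tags)
            if tags is not None:
                # Training: negative log-likelihood of the gold tag sequence.
                return -self.crf(emissions, tags.unsqueeze(0))
            # Inference: Viterbi decoding of the best tag sequence.
            return self.crf.decode(emissions)[0]

    def sliding_windows(sentences, window=64, stride=32):
        # Split a long dialogue into overlapping windows so sentences near
        # window boundaries still see surrounding context.
        start = 0
        while True:
            yield start, sentences[start:start + window]
            if start + window >= len(sentences):
                break
            start += stride

At inference time, each dialogue would be split with sliding_windows, every window labeled independently, and the overlapping predictions merged (e.g., by majority vote), so that sentences near window boundaries keep context on both sides.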

Acknowledgements

This work is supported by grants from the Fundamental Research Funds for the Central Universities (No. DUT22ZD205).

Author information

Correspondence to Zhihao Yang.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Pan, D., Yang, Z., Tan, H., Wu, J., Lin, H. (2022). Dialogue Topic Extraction as Sentence Sequence Labeling. In: Lu, W., Huang, S., Hong, Y., Zhou, X. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2022. Lecture Notes in Computer Science, vol. 13552. Springer, Cham. https://doi.org/10.1007/978-3-031-17189-5_21

  • DOI: https://doi.org/10.1007/978-3-031-17189-5_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-17188-8

  • Online ISBN: 978-3-031-17189-5

  • eBook Packages: Computer Science; Computer Science (R0)
