Abstract
Recent work in dialogue state tracking (DST) focuses on a handful of languages, as collecting large-scale manually annotated data in different languages is expensive. Existing models address this issue through code-switched data augmentation or intermediate fine-tuning of multilingual pre-trained models. However, these models can only perform implicit alignment across languages. In this paper, we propose a novel model named Contrastive Learning for Cross-Lingual DST (CLCL-DST) to enhance zero-shot cross-lingual adaptation. Specifically, we use a self-built bilingual dictionary for lexical substitution to construct multilingual views of the same utterance. Our approach then leverages fine-grained contrastive learning to encourage the representations of specific slot tokens in different views to be more similar than those of negative example pairs. In this way, CLCL-DST aligns similar words across languages into a more refined language-invariant space. In addition, CLCL-DST uses a significance-based keyword extraction approach to select task-related words when building the bilingual dictionary, yielding better cross-lingual positive examples. Experimental results on the Multilingual WoZ 2.0 and parallel MultiWoZ 2.1 datasets show that our proposed CLCL-DST outperforms existing state-of-the-art methods by a large margin, demonstrating its effectiveness.
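The two core ideas of the abstract, dictionary-based lexical substitution to build multilingual views and a token-level contrastive objective, can be illustrated with a minimal sketch. This is an assumption-laden toy implementation, not the paper's actual method: the bilingual dictionary entries, substitution ratio, and plain-list embeddings are all hypothetical stand-ins, and the loss is a standard InfoNCE formulation over cosine similarity.

```python
import math
import random

def code_switch(tokens, bilingual_dict, ratio=0.5, seed=0):
    """Build a multilingual view of an utterance by substituting words
    with their dictionary translations (a toy dictionary here; the paper
    builds its own from significance-based keyword extraction)."""
    rng = random.Random(seed)
    return [
        bilingual_dict[tok] if tok in bilingual_dict and rng.random() < ratio
        else tok
        for tok in tokens
    ]

def info_nce(anchor, positive, negatives, tau=0.1):
    """Token-level InfoNCE: pull the anchor slot-token embedding toward
    its cross-lingual positive and push it away from negatives."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) *
                      math.sqrt(sum(b * b for b in v)))
    pos = math.exp(cos(anchor, positive) / tau)
    neg = sum(math.exp(cos(anchor, n) / tau) for n in negatives)
    return -math.log(pos / (pos + neg))

# Example: a code-switched view of a restaurant-domain utterance,
# with a made-up English-German word list.
utterance = ["book", "a", "cheap", "restaurant"]
toy_dict = {"cheap": "günstig", "restaurant": "Restaurant"}
view = code_switch(utterance, toy_dict, ratio=1.0)

# A well-aligned cross-lingual pair yields a lower contrastive loss
# than a mismatched one (2-d toy embeddings).
loss_aligned = info_nce([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0]])
loss_mismatched = info_nce([1.0, 0.0], [0.0, 1.0], [[0.9, 0.1]])
```

Minimizing this loss over slot tokens in the original and code-switched views is what drives representations of translation pairs toward a shared, language-invariant space.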
Acknowledgement
The research work described in this paper has been supported by the National Key R&D Program of China (2020AAA0108005), the National Natural Science Foundation of China (Nos. 61976015, 61976016, 61876198 and 61370130) and Toshiba (China) Co., Ltd. The authors would like to thank the anonymous reviewers for their valuable comments and suggestions to improve this paper.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Xiang, Y. et al. (2023). Improving Zero-Shot Cross-Lingual Dialogue State Tracking via Contrastive Learning. In: Sun, M., et al. Chinese Computational Linguistics. CCL 2023. Lecture Notes in Computer Science(), vol 14232. Springer, Singapore. https://doi.org/10.1007/978-981-99-6207-5_8
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-6206-8
Online ISBN: 978-981-99-6207-5
eBook Packages: Computer Science (R0)