Abstract
Cross-lingual spoken language understanding (cross-lingual SLU), a key component of task-oriented dialogue systems, is widely used in industrial and real-world scenarios such as multilingual customer support systems, cross-border communication platforms, and international language learning tools. However, obtaining large-scale, high-quality datasets for SLU is challenging due to the high cost of dialogue collection and manual annotation, particularly for minority languages. As a result, there is increasing interest in leveraging high-resource language data for cross-lingual transfer learning. Existing approaches for zero-shot cross-lingual SLU primarily focus on the relationship between the source language sentence and a single generated cross-lingual sentence, disregarding the information shared among multiple languages. This limitation weakens the robustness of multilingual word embedding representations and hampers the scalability of the model. In this paper, we propose a multilingual mixture attention interaction framework with adversarial training to alleviate these problems. Specifically, we leverage the source language sentence to generate multiple multilingual hybrid sentences, in which words can adaptively capture unambiguous representations from the aligned multilingual words during the encoding phase, and adversarial training is introduced to enhance the scalability of the model. Then, we incorporate a symmetric kernel self-attention module with positional embedding to learn contextual information within a sentence, and employ multi-relation graph convolutional networks to learn information at different granularities between the two highly correlated tasks of intent detection and slot filling. Experimental results on the public MultiATIS++ dataset demonstrate that our proposed model achieves state-of-the-art performance, and comprehensive analysis validates the effectiveness of each component.
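The generation of multilingual hybrid sentences described above can be illustrated with a minimal sketch in the spirit of code-switching data augmentation (as in CoSDA-ML [17]): each source-language token is stochastically replaced with an aligned translation drawn from a bilingual lexicon. The lexicon and sentence below are toy examples for illustration only; the paper's actual alignment resources and replacement policy may differ.

```python
import random

# Toy multilingual lexicon: source (English) token -> aligned translations.
# In practice such alignments come from bilingual dictionaries (assumption).
LEXICON = {
    "flights": {"de": "Flüge", "es": "vuelos"},
    "from": {"de": "von", "es": "de"},
    "boston": {"de": "Boston", "es": "Boston"},
}

def hybrid_sentence(tokens, lexicon, switch_prob=0.5, seed=None):
    """Replace each token with a randomly chosen aligned translation
    with probability switch_prob; tokens without an entry are kept."""
    rng = random.Random(seed)
    out = []
    for tok in tokens:
        translations = lexicon.get(tok.lower())
        if translations and rng.random() < switch_prob:
            lang = rng.choice(sorted(translations))  # pick a target language
            out.append(translations[lang])
        else:
            out.append(tok)
    return out

print(hybrid_sentence(["show", "flights", "from", "boston"], LEXICON, seed=0))
```

Running the generator several times over the same source sentence yields multiple distinct hybrid sentences, which is what allows words to attend to aligned representations from several languages during encoding.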
Availability of data and material
MultiATIS++ [9]: https://github.com/amazon-research/multiatis.
References
Chen H, Liu X, Yin D, Tang J (2017) A survey on dialogue systems: recent advances and new frontiers. ACM SIGKDD Explor Newsl 19(2):25–35
Ni J, Young T, Pandelea V, Xue F, Cambria E (2023) Recent advances in deep learning based dialogue systems: a systematic survey. Artif Intell Rev 56(4):3055–3155
Tur G, Hakkani-Tür D, Heck L (2010) What is left to be understood in ATIS? In: 2010 IEEE spoken language technology workshop. IEEE, pp 19–24
Tur G, De Mori R (2011) Spoken language understanding: systems for extracting semantic information from speech. John Wiley & Sons, Hoboken
Haihong E, Niu P, Chen Z, Song M (2019) A novel bi-directional interrelated model for joint intent detection and slot filling. In: Proceedings of the 57th annual meeting of the association for computational linguistics, pp 5467–5471
Qin L, Xie T, Che W, Liu T (2021) A survey on spoken language understanding: recent advances and new frontiers. In: Proceedings of the thirtieth international joint conference on artificial intelligence survey track
Sun M, Huang K, Moradshahi M (2021) Investigating effect of dialogue history in multilingual task oriented dialogue systems. arXiv preprint arXiv:2112.12318
Devlin J, Chang M-W, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (Long and Short Papers), pp 4171–4186
Xu W, Haider B, Mansour S (2020) End-to-end slot alignment and recognition for cross-lingual NLU. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), pp 5052–5063
Liu Z, Winata GI, Lin Z, Xu P, Fung PN (2020) Attention-informed mixed-language training for zero-shot cross-lingual task-oriented dialogue systems. In: Proceedings of the AAAI conference on artificial intelligence
Qin L, Ni M, Zhang Y, Che W (2021) CoSDA-ML: multi-lingual code-switching data augmentation for zero-shot cross-lingual NLP. In: Proceedings of the twenty-ninth international joint conference on artificial intelligence, pp 3853–3860
Qin L, Chen Q, Xie T, Li Q, Lou J-G, Che W, Kan M-Y (2022) GL-CLeF: a global–local contrastive learning framework for cross-lingual spoken language understanding. In: Proceedings of the 60th annual meeting of the association for computational linguistics (Volume 1: Long Papers), pp 2677–2686
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. Adv Neural Inf Process Syst 30
Kipf TN, Welling M (2016) Semi-supervised classification with graph convolutional networks. In: International conference on learning representations
De Cao N, Aziz W, Titov I (2019) Question answering by reasoning across documents with graph convolutional networks. In: 2019 annual conference of the North American chapter of the association for computational linguistics. Association for Computational Linguistics, pp 2306–2317
Tu M, Wang G, Huang J, Tang Y, He X, Zhou B (2019) Multi-hop reading comprehension across multiple documents by reasoning over heterogeneous graphs. In: Proceedings of the 57th annual meeting of the association for computational linguistics, pp 2704–2713
Li X, Chen Y-N, Li L, Gao J, Celikyilmaz A (2017) End-to-end task-completion neural dialogue systems. In: Proceedings of the eighth international joint conference on natural language processing (Volume 1: Long Papers), pp 733–743
Zhang X, Wang H (2016) A joint model of intent determination and slot filling for spoken language understanding. In: Proceedings of the twenty-fifth international joint conference on artificial intelligence, pp 2993–2999
Liu B, Lane I (2016) Attention-based recurrent neural network models for joint intent detection and slot filling. Interspeech 2016:685–689
Goo C-W, Gao G, Hsu Y-K, Huo C-L, Chen T-C, Hsu K-W, Chen Y-N (2018) Slot-gated modeling for joint slot filling and intent prediction. In: Proceedings of the 2018 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 2 (Short Papers), pp 753–757
Li C, Li L, Qi J (2018) A self-attentive model with gate mechanism for spoken language understanding. In: Proceedings of the 2018 conference on empirical methods in natural language processing, pp 3824–3833
Zhang L, Ma D, Zhang X, Yan X, Wang H (2020) Graph LSTM with context-gated mechanism for spoken language understanding. In: Proceedings of the AAAI conference on artificial intelligence, vol 34, pp 9539–9546
Liu Y, Meng F, Zhang J, Zhou J, Chen Y, Xu J (2019) CM-Net: a novel collaborative memory network for spoken language understanding. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP), pp 1051–1060
Qin L, Liu T, Che W, Kang B, Zhao S, Liu T (2021) A co-interactive transformer for joint slot filling and intent detection. In: ICASSP 2021-2021 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 8193–8197
Conneau A, Khandelwal K, Goyal N, Chaudhary V, Wenzek G, Guzmán F, Grave É, Ott M, Zettlemoyer L, Stoyanov V (2020) Unsupervised cross-lingual representation learning at scale. In: Proceedings of the 58th annual meeting of the association for computational linguistics, pp 8440–8451
Guarasci R, Silvestri S, De Pietro G, Fujita H, Esposito M (2022) Bert syntactic transfer: a computational experiment on Italian, French and English languages. Comput Speech Lang 71:101261
Lample G, Conneau A, Ranzato M, Denoyer L, Jégou H (2018) Word translation without parallel data. In: International conference on learning representations
Chen X, Cardie C (2018) Unsupervised multilingual word embeddings. In: Proceedings of the 2018 conference on empirical methods in natural language processing
Conneau A, Lample G (2019) Cross-lingual language model pretraining. In: Proceedings of the 33rd international conference on neural information processing systems, pp 7059–7069
Mulcaire P, Kasai J, Smith NA (2019) Polyglot contextual representations improve crosslingual transfer. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (Long and Short Papers), pp 3912–3918
Liu NF, Gardner M, Belinkov Y, Peters ME, Smith NA (2019) Linguistic knowledge and transferability of contextual representations. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (Long and Short Papers), pp 1073–1094
Ahmad W, Li H, Chang K-W, Mehdad Y (2021) Syntax-augmented multilingual BERT for cross-lingual transfer. In: Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing (Volume 1: Long Papers), pp 4538–4554
Xu D, Li J, Zhu M, Zhang M, Zhou G (2021) XLPT-AMR: cross-lingual pre-training via multi-task learning for zero-shot AMR parsing and text generation. In: Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing (Volume 1: Long Papers), pp 896–907
Goodfellow IJ, Shlens J, Szegedy C (2015) Explaining and harnessing adversarial examples. In: International conference on learning representations
Sato M, Suzuki J, Shindo H, Matsumoto Y (2018) Interpretable adversarial perturbation in input embedding space for text. In: 27th international joint conference on artificial intelligence, IJCAI 2018, pp 4323–4330
Madry A, Makelov A, Schmidt L, Tsipras D, Vladu A (2018) Towards deep learning models resistant to adversarial attacks. In: International conference on learning representations
Miyato T, Dai AM, Goodfellow I (2016) Adversarial training methods for semi-supervised text classification. In: International conference on learning representations
Tsai Y-HH, Bai S, Yamada M, Morency L-P, Salakhutdinov R (2019) Transformer dissection: an unified understanding for transformer’s attention via the lens of kernel. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP), pp 4344–4353
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980
Zhu Q, Khan H, Soltan S, Rawls S, Hamza W (2020) Don’t parse, insert: multilingual semantic parsing with insertion based decoding. In: Proceedings of the 24th conference on computational natural language learning, pp 496–506
Cite this article
Zhang, Q., Wang, S. & Li, J. Multilingual mixture attention interaction framework with adversarial training for cross-lingual SLU. Neural Comput & Applic 36, 1915–1930 (2024). https://doi.org/10.1007/s00521-023-09132-5