Abstract
Intent detection and slot filling are core tasks in natural language understanding (NLU) for task-oriented dialogue systems. However, current models struggle with large numbers of intent categories, slot types, and domain classes, as well as a shortage of well-annotated datasets, particularly in Chinese. We therefore propose a domain-aware model with multi-perspective, multi-positive contrastive learning. First, we adopt self-supervised contrastive learning with multiple perspectives and multiple positive instances, which separates the representations of positive and negative instances from the domain, intent, and slot perspectives and fuses information from additional positive instances to improve classification. Second, the proposed domain-aware model defines domain-level units at the decoding layer, so that intents and slots are predicted conditioned on domain features, which greatly reduces the intent and slot search space. In addition, we design a dual-stage attention mechanism to capture implicitly shared information between intents and slots. We also propose a data augmentation method that adds noise at the embedding layer, applies fine-grained augmentation techniques, and filters biased samples with a similarity threshold. The model is deployed in a real task-oriented dialogue system and compared with other NLU models; experimental results demonstrate that it outperforms these models on NLU performance.
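To make the multi-perspective, multi-positive objective concrete, the sketch below shows a generic multi-positive InfoNCE-style loss in PyTorch: every in-batch sample that shares a label with the anchor is treated as a positive, and the per-anchor log-probabilities of all positives are averaged. The temperature value, the perspective-specific embeddings and labels (`z_dom`, `y_dom`, etc.), and the exact loss form are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F


def multi_positive_contrastive_loss(embeddings: torch.Tensor,
                                     labels: torch.Tensor,
                                     temperature: float = 0.1) -> torch.Tensor:
    """Generic multi-positive InfoNCE-style loss (a sketch, not the paper's
    exact formulation): in-batch samples with the same label as the anchor
    are positives; all other samples act as negatives."""
    z = F.normalize(embeddings, dim=-1)                  # (B, D) unit vectors
    sim = z @ z.t() / temperature                        # (B, B) scaled similarities
    batch = z.size(0)
    self_mask = torch.eye(batch, dtype=torch.bool, device=z.device)
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask

    sim = sim.masked_fill(self_mask, float("-inf"))      # exclude the anchor itself
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    pos_counts = pos_mask.sum(dim=1)
    valid = pos_counts > 0                               # anchors with >= 1 positive
    mean_log_prob_pos = (log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1)[valid]
                         / pos_counts[valid])
    return -mean_log_prob_pos.mean()


# Hypothetical usage: one loss per perspective (domain, intent, slot), summed,
# e.g. loss = sum(multi_positive_contrastive_loss(z, y)
#                 for z, y in [(z_dom, y_dom), (z_int, y_int), (z_slot, y_slot)])
```

Under this reading, the multi-perspective objective is simply the sum of one such loss per labeling granularity, which is one plausible way to "space" positives and negatives along the domain, intent, and slot views simultaneously.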
Data availability and access
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (12273003).
Author information
Contributions
Di Wang: Methodology, Validation, Formal analysis, Writing - original draft, Visualization. Qingjian Ni: Writing - review & editing.
Ethics declarations
Competing Interests
The authors declare that they have no conflict of interest.
Ethical and informed consent for data used
The authors state that all data used in this study were obtained ethically and in accordance with informed consent protocols. The dataset used in this study was obtained from NIO, and its use for academic or research purposes does not violate any copyright, intellectual property, or data protection legislation.
About this article
Cite this article
Wang, D., Ni, Q. A domain-aware model with multi-perspective contrastive learning for natural language understanding. Appl Intell 55, 218 (2025). https://doi.org/10.1007/s10489-024-06154-x