A domain-aware model with multi-perspective contrastive learning for natural language understanding

Published in: Applied Intelligence

Abstract

Intent detection and slot filling are core tasks in natural language understanding (NLU) for task-oriented dialogue systems. However, current models struggle with large numbers of intent categories, slot types, and domain classes, as well as a shortage of well-annotated datasets, particularly in Chinese. We therefore propose a domain-aware model with multi-perspective, multi-positive contrastive learning. First, we adopt self-supervised contrastive learning with multiple perspectives and multiple positive instances, which separates the representations of positive and negative instances from the domain, intent, and slot perspectives and fuses information from additional positive instances to improve classification. Second, our domain-aware model defines domain-level units at the decoding layer, so that intent and slot predictions are conditioned on domain features, which greatly reduces the search space for intents and slots. In addition, we design a dual-stage attention mechanism to capture implicitly shared information between intents and slots. We also propose a data augmentation method that adds noise to the embedding layer, applies fine-grained augmentation techniques, and filters biased samples using a similarity threshold. We apply our model to real task-oriented dialogue systems and compare it with other NLU models; experimental results demonstrate that it outperforms them in NLU performance.


(Figures 1–11 are available in the full article.)


Data availability and access

The data that support the findings of this study are available from the corresponding author upon reasonable request.


Acknowledgements

This paper is supported by the National Natural Science Foundation of China (12273003).

Author information

Authors and Affiliations

Authors

Contributions

Di Wang: Methodology, Validation, Formal analysis, Writing - original draft, Visualization. Qingjian Ni: Writing - review & editing.

Corresponding author

Correspondence to Qingjian Ni.

Ethics declarations

Competing Interests

The authors declare that they have no conflict of interest.

Ethical and informed consent for data used

All data used in this study were obtained ethically and in accordance with informed-consent protocols. The dataset was obtained from NIO, and its use for academic or research purposes does not violate any copyright, intellectual property, or data protection legislation.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, D., Ni, Q. A domain-aware model with multi-perspective contrastive learning for natural language understanding. Appl Intell 55, 218 (2025). https://doi.org/10.1007/s10489-024-06154-x

