A Heterogeneous Interaction Graph Network for Multi-Intent Spoken Language Understanding

Published in: Neural Processing Letters

Abstract

As the core component of intelligent dialogue systems, spoken language understanding (SLU) usually comprises two tasks: intent detection and slot filling. In real-world scenarios, users may express multiple intents in a single utterance, and a token-level slot label can belong to multiple intents. The two tasks are closely related and inform each other. In this paper, we propose a heterogeneous interaction graph framework with a window mechanism for joint multi-intent detection and slot filling, which captures the rich semantic information of different granularities in heterogeneous information. We leverage different types of nodes and edges to construct a heterogeneous graph that realizes the interaction between coarse-grained sentence-level intent information and fine-grained word-level slot information. We further employ a window mechanism to exploit the temporal locality of slot information. Experimental results on two datasets show that our model achieves state-of-the-art performance, and comprehensive analysis empirically verifies the effectiveness of each component.
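As a rough illustration of the connectivity the abstract describes, the sketch below builds the edge list of such a heterogeneous graph: intent nodes interact with every slot (token) node, while slot nodes are linked only to neighbours inside a local window, reflecting the temporal locality of slot information. This is a hedged sketch, not the authors' implementation; the function name, node-id scheme, and exact window semantics are assumptions made for illustration.

```python
def build_hetero_edges(num_tokens: int, num_intents: int, window: int):
    """Build directed edges for a heterogeneous intent-slot graph.

    Node ids: tokens occupy 0..num_tokens-1, intent nodes occupy
    num_tokens..num_tokens+num_intents-1.
    """
    edges = []
    # Coarse-grained interaction: every intent node is connected to
    # every slot node, in both directions, so sentence-level intent
    # information and word-level slot information can guide each other.
    for i in range(num_intents):
        intent = num_tokens + i
        for t in range(num_tokens):
            edges.append((intent, t))  # intent -> slot guidance
            edges.append((t, intent))  # slot -> intent evidence
    # Fine-grained interaction: each slot node is connected only to slot
    # nodes within a local window (including itself), modelling the
    # temporal locality of slot labels.
    for t in range(num_tokens):
        for u in range(max(0, t - window), min(num_tokens, t + window + 1)):
            edges.append((t, u))
    return edges
```

For a 5-token utterance with 2 candidate intents and a window of 1, every intent node touches all 5 token nodes, while token 0 is linked only to itself and token 1; a graph attention network would then propagate messages along these typed edges.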


Data Availability Statement

MixATIS and MixSNIPS datasets: https://github.com/LooperXX/AGIF.


Author information


Corresponding author

Correspondence to Qichen Zhang.

Ethics declarations

Conflict of interest

No conflict of interest exists in the submission of this manuscript, and the manuscript is approved by all authors for publication. On behalf of my co-authors, I declare that the work described is original research that has not been published previously and is not under consideration for publication elsewhere, in whole or in part. All listed authors have approved the enclosed manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhang, Q., Wang, S. & Li, J. A Heterogeneous Interaction Graph Network for Multi-Intent Spoken Language Understanding. Neural Process Lett 55, 9483–9501 (2023). https://doi.org/10.1007/s11063-023-11210-7
