
Hierarchical temporal slot interactions for dialogue state tracking

  • Original Article
  • Journal: Neural Computing and Applications

Abstract

Dialogue state tracking (DST), an essential component of task-oriented dialogue systems, keeps track of the user’s intentions as a conversation progresses. Typical methods either formulate it as a classification task over fixed, pre-defined slot-value pairs or generate slot-value candidates from the dialogue history. Most of them are limited in how they model the interactions of slots with utterances and with other slots as the dialogue progresses. To tackle this problem, we propose a Dialogue State Tracker with Hierarchical Temporal Slot Interactions (DST-HTSI), which captures slot-related semantic information from utterances and slots. It first captures interactive information among slots within a turn and across turns by applying hierarchical slot interactions. A temporal slot interaction module then establishes slot dependencies over time. Finally, a GRU decoder generates the value for each slot. We also use pre-trained language models as the backbone of our model. Experiments show that DST-HTSI outperforms previous state-of-the-art methods on MultiWOZ 2.2 and WOZ 2.0 and achieves competitive results on MultiWOZ 2.1.
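To make the task concrete, the following is a minimal sketch of a dialogue state expressed as slot-value pairs and carried across turns. The slot names, the values, and the `update_state` and `track` helpers are illustrative assumptions rather than anything taken from the paper, and the rule-based update only stands in for the learned tracker.

```python
from typing import Dict, List

# A dialogue state maps "domain-slot" names to values (hypothetical examples).
DialogueState = Dict[str, str]


def update_state(prev: DialogueState, turn_update: DialogueState) -> DialogueState:
    """Return the new state after one turn.

    Slots mentioned in this turn overwrite earlier values; slots that are
    not mentioned carry over unchanged. A value of "none" clears the slot.
    """
    state = dict(prev)
    for slot, value in turn_update.items():
        if value == "none":
            state.pop(slot, None)
        else:
            state[slot] = value
    return state


def track(turn_updates: List[DialogueState]) -> List[DialogueState]:
    """Accumulate per-turn updates into the state after every turn."""
    states: List[DialogueState] = []
    state: DialogueState = {}
    for update in turn_updates:
        state = update_state(state, update)
        states.append(state)
    return states


# Illustrative three-turn dialogue (invented values, not from any dataset).
turns = [
    {"hotel-area": "centre"},
    {"hotel-stars": "4", "hotel-area": "north"},
    {"taxi-destination": "the hotel", "hotel-stars": "none"},
]
states = track(turns)
```

In a learned tracker, the per-turn updates are predicted from the utterances instead of being given, which is where interactions among slots and across turns become relevant.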

Data availability

The datasets analyzed during the current study are available in the multiwoz repository (MultiWOZ 2.1 and 2.2; https://github.com/budzianowski/multiwoz) and in the N2N-Dialogue-System repository (WOZ 2.0; https://github.com/Yusser95/N2N-Dialogue-System).

Notes

  1. Residual connections and layer normalization are omitted for simplicity.

  2. In this paper, PLMs refers specifically to pre-trained transformers such as BERT [19] and GPT-2 [37].

  3. We call such slot pairs connected pairs for simplicity.
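As an illustration of what Note 1 leaves out, here is a minimal sketch of a residual connection followed by layer normalization, assuming the post-norm arrangement LayerNorm(x + Sublayer(x)) of the original Transformer [40]. The article text available here does not say which arrangement the model uses, so this is an assumption. The `toy_sublayer` function is a hypothetical stand-in for an attention or feed-forward sublayer, and the learned gain and bias of layer normalization are omitted.

```python
import math
from typing import Callable, List

Vector = List[float]


def layer_norm(x: Vector, eps: float = 1e-5) -> Vector:
    """Normalize a vector to zero mean and (approximately) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def with_residual(sublayer: Callable[[Vector], Vector], x: Vector) -> Vector:
    """Post-norm wrapper (assumed arrangement): LayerNorm(x + sublayer(x))."""
    y = sublayer(x)
    return layer_norm([a + b for a, b in zip(x, y)])


def toy_sublayer(x: Vector) -> Vector:
    """Hypothetical stand-in for an attention or feed-forward sublayer."""
    return [2.0 * v for v in x]


out = with_residual(toy_sublayer, [1.0, 2.0, 3.0])
```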

References

  1. Chen H, Liu X, Yin D, Tang J (2017) A survey on dialogue systems: recent advances and new frontiers. ACM SIGKDD Explor Newsl 19(2):25–35

  2. Williams JD, Raux A, Henderson M (2016) The dialog state tracking challenge series: a review. Dialogue Discourse 7(3):4–33

  3. Ni P, Li Y, Li G, Chang V (2020) Natural language understanding approaches based on joint task of intent detection and slot filling for IOT voice interaction. Neural Comput Appl 32(20):16149–16166

  4. Mrkšić N, Ó Séaghdha D, Wen T-H, Thomson B, Young S (2017) Neural belief tracker: data-driven dialogue state tracking. In: Proceedings of the 55th annual meeting of the association for computational linguistics, vol 1: Long Papers, pp 1777–1788

  5. Zhong V, Xiong C, Socher R (2018) Global-locally self-attentive dialogue state tracker. arXiv preprint arXiv:1805.09655

  6. Nouri E, Hosseini-Asl E (2018) Toward scalable neural dialogue state tracking model. arXiv preprint arXiv:1812.00899

  7. Lee H, Lee J, Kim T-Y (2019) SUMBT: slot-utterance matching for universal and scalable belief tracking. In: Proceedings of the 57th annual meeting of the association for computational linguistics, pp 5478–5483

  8. Gao S, Sethi A, Agarwal S, Chung T, Hakkani-Tur D (2019) Dialog state tracking: a neural reading comprehension approach. In: 20th annual meeting of the special interest group on discourse and dialogue, p 264

  9. Ren L, Xie K, Chen L, Yu K (2018) Towards universal dialogue state tracking. arXiv preprint arXiv:1810.09587

  10. Wu C-S, Madotto A, Hosseini-Asl E, Xiong C, Socher R, Fung P (2019) Transferable multi-domain state generator for task-oriented dialogue systems. In: Proceedings of the 57th annual meeting of the association for computational linguistics, pp 808–819

  11. Quan J, Xiong D (2020) Modeling long context for task-oriented dialogue state generation. In: Proceedings of the 58th annual meeting of the association for computational linguistics, pp 7119–7124

  12. Gu J, Lu Z, Li H, Li VO (2016) Incorporating copying mechanism in sequence-to-sequence learning. In: Proceedings of the 54th annual meeting of the association for computational linguistics, vol 1: Long Papers, pp 1631–1640

  13. Kim S, Yang S, Kim G, Lee S-W (2020) Efficient dialogue state tracking by selectively overwriting memory. In: Proceedings of the 58th annual meeting of the association for computational linguistics, pp 567–582

  14. Hu J, Yang Y, Chen C, Yu Z, et al. (2020) SAS: dialogue state tracking via slot attention and slot information sharing. In: Proceedings of the 58th annual meeting of the association for computational linguistics, pp 6366–6375

  15. Ouyang Y, Chen M, Dai X, Zhao Y, Huang S, Chen J (2020) Dialogue state tracking with explicit slot connection modeling. In: Proceedings of the 58th annual meeting of the association for computational linguistics, pp 34–40

  16. Thomson B, Young S (2010) Bayesian update of dialogue state: a POMDP framework for spoken dialogue systems. Comput Speech Lang 24(4):562–588

  17. Wang Z, Lemon O (2013) A simple and generic belief tracking mechanism for the dialog state tracking challenge: on the believability of observed information. In: Proceedings of the SIGDIAL 2013 conference, pp 423–432

  18. Henderson M, Thomson B, Young S (2014) Word-based dialog state tracking with recurrent neural networks. In: Proceedings of the 15th annual meeting of the special interest group on discourse and dialogue (SIGDIAL), pp 292–299

  19. Devlin J, Chang M-W, Lee K, Toutanova K (2018) BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805

  20. Heck M, van Niekerk C, Lubis N, Geishauser C, Lin H-C, Moresi M, Gasic M (2020) TripPy: a triple copy strategy for value independent neural dialog state tracking. In: Proceedings of the 21st annual meeting of the special interest group on discourse and dialogue, pp 35–44

  21. Vinyals O, Fortunato M, Jaitly N (2015) Pointer networks. In: Proceedings of the 28th international conference on neural information processing systems, vol 2, pp 2692–2700

  22. Chen L, Lv B, Wang C, Zhu S, Tan B, Yu K (2020) Schema-guided multi-domain dialogue state tracking with graph attention neural networks. In: Proceedings of the AAAI conference on artificial intelligence, vol 34, pp 7521–7528

  23. Mehri S, Eric M, Hakkani-Tur D (2020) DialoGLUE: a natural language understanding benchmark for task-oriented dialogue. arXiv e-prints, 2009

  24. Yang G, Wang X, Yuan C (2019) Hierarchical dialog state tracking with unknown slot values. Neural Process Lett 50(2):1611–1625

  25. Lee C-H, Cheng H, Ostendorf M (2021) Dialogue state tracking with a language model using schema-driven prompting. In: Proceedings of the 2021 conference on empirical methods in natural language processing, pp 4937–4949

  26. Tian X, Huang L, Lin Y, Bao S, He H, Yang Y, Wu H, Wang F, Sun S (2021) Amendable generation for dialogue state tracking. In: Proceedings of the 3rd workshop on natural language processing for conversational AI, pp 80–92

  27. Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473

  28. Budzianowski P, Casanueva I, Tseng B, Gasic M (2018) Towards end-to-end multi-domain dialogue modelling

  29. Chen W, Chen J, Qin P, Yan X, Wang WY (2019) Semantically conditioned dialog response generation via hierarchical disentangled self-attention. In: Proceedings of the 57th annual meeting of the association for computational linguistics, pp 3696–3709

  30. Liu B, Lane I (2016) Attention-based recurrent neural network models for joint intent detection and slot filling. arXiv preprint arXiv:1609.01454

  31. Ma H, Wang J, Qian L, Lin H (2021) HAN-ReGRU: hierarchical attention network with residual gated recurrent unit for emotion recognition in conversation. Neural Comput Appl 33(7):2685–2703

  32. Kumar A, Ku P, Goyal A, Metallinou A, Hakkani-Tur D (2020) MA-DST: multi-attention-based scalable dialog state tracking. In: Proceedings of the AAAI conference on artificial intelligence, vol 34, pp 8107–8114

  33. Zhu S, Li J, Chen L, Yu K (2020) Efficient context and schema fusion networks for multi-domain dialogue state tracking. In: Proceedings of the 2020 conference on empirical methods in natural language processing: findings, pp 766–781

  34. Dai Y, Li H, Li Y, Sun J, Huang F, Si L, Zhu X (2021) Preview, attend and review: schema-aware curriculum learning for multi-domain dialogue state tracking. In: Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing, vol 2: Short Papers, pp 879–885

  35. Feng Y, Lipani A, Ye F, Zhang Q, Yilmaz E (2022) Dynamic schema graph fusion network for multi-domain dialogue state tracking. In: Proceedings of the 60th annual meeting of the association for computational linguistics, vol 1: Long Papers, pp 115–126

  36. Feng Y, Wang Y, Li H (2021) A sequence-to-sequence approach to dialogue state tracking. In: Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing, vol 1: Long Papers, pp 1714–1725

  37. Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I (2019) Language models are unsupervised multitask learners. OpenAI blog 1(8):9

  38. Bao S, He H, Wang F, Wu H, Wang H, Wu W, Guo Z, Liu Z, Xu X (2021) PLATO-2: towards building an open-domain chatbot via curriculum learning. In: Findings of the association for computational linguistics: ACL-IJCNLP 2021, pp 2513–2525

  39. Raffel C, Shazeer N, Roberts A, Lee K, Narang S, Matena M, Zhou Y, Li W, Liu PJ (2020) Exploring the limits of transfer learning with a unified text-to-text transformer. J Mach Learn Res 21:1–67

  40. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. Adv Neural Inf Process Syst 30:5998–6008

  41. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778

  42. Ba JL, Kiros JR, Hinton GE (2016) Layer normalization. arXiv preprint arXiv:1607.06450

  43. Hendrycks D, Gimpel K (2016) Bridging nonlinearities and stochastic regularizers with Gaussian error linear units. arXiv preprint arXiv:1606.08415

  44. Chung J, Gulcehre C, Cho K, Bengio Y (2014) Empirical evaluation of gated recurrent neural networks on sequence modeling. In: NIPS 2014 workshop on deep learning, December 2014

  45. Press O, Wolf L (2017) Using the output embedding to improve language models. In: Proceedings of the 15th conference of the European chapter of the association for computational linguistics, vol 2: Short Papers, pp 157–163

  46. Eric M, Goel R, Paul S, Sethi A, Agarwal S, Gao S, Hakkani-Tür D (2019) MultiWOZ 2.1: multi-domain dialogue state corrections and state tracking baselines

  47. Zang X, Rastogi A, Sunkara S, Gupta R, Zhang J, Chen J (2020) MultiWOZ 2.2: a dialogue dataset with additional annotation corrections and state tracking baselines. In: Proceedings of the 2nd workshop on natural language processing for conversational AI, pp 109–117

  48. Wen T-H, Vandyke D, Mrkšić N, Gasic M, Barahona LMR, Su P-H, Ultes S, Young S (2017) A network-based end-to-end trainable task-oriented dialogue system. In: Proceedings of the 15th conference of the European chapter of the association for computational linguistics, vol 1: Long Papers, pp 438–449

  49. Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980

  50. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2818–2826

  51. Pan B, Yang Y, Li B, Cai D (2021) Self-supervised attention flow for dialogue state tracking. Neurocomputing 440:279–286

  52. Zhang J, Hashimoto K, Wu C-S, Wang Y, Yu PS, Socher R, Xiong C (2020) Find or classify? Dual strategy for slot-value predictions on multi-domain dialog state tracking. In: Proceedings of the ninth joint conference on lexical and computational semantics, pp 154–167

  53. Hosseini-Asl E, McCann B, Wu C-S, Yavuz S, Socher R (2020) A simple language model for task-oriented dialogue. Adv Neural Inf Process Syst 33:20179–20191

Author information

Corresponding author

Correspondence to Haidong Zhang.

Ethics declarations

Conflict of interest

We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Qiu, J., Lin, Z., Zhang, H. et al. Hierarchical temporal slot interactions for dialogue state tracking. Neural Comput & Applic 35, 5791–5805 (2023). https://doi.org/10.1007/s00521-022-07959-y

  • DOI: https://doi.org/10.1007/s00521-022-07959-y
