Abstract
Dialogue state tracking (DST), an essential component of task-oriented dialogue systems, keeps track of the user’s intentions as a conversation progresses. Typical methods either formulate it as a classification task over fixed, pre-defined slot-value pairs or generate slot-value candidates from the dialogue history. Most of them are limited in how they model the interactions of slots with utterances and with other slots as the dialogue progresses. To tackle this problem, we propose a Dialogue State Tracker with Hierarchical Temporal Slot Interactions (DST-HTSI) that captures slot-related semantic information from both utterances and slots. It first captures interactive information among slots within a turn and across turns by applying hierarchical slot interactions. A temporal slot interaction module then establishes slot dependencies over time. Finally, a GRU decoder generates the value for each slot. We also use pre-trained language models as the backbone of our model. Experiments show that DST-HTSI outperforms previous state-of-the-art methods on MultiWOZ 2.2 and WOZ 2.0 and achieves competitive results on MultiWOZ 2.1.
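To make the task itself concrete, the following minimal sketch shows what "keeping track of the user's intentions" means in terms of data: a dialogue state is a set of slot-value pairs that is carried over from turn to turn and overwritten when the user changes a constraint. This is an illustration of the general DST setting only, not the DST-HTSI model. The slot names, the example values, the "none" retraction marker, and the helper names update_state and track are all assumptions made for this example.

```python
from typing import Dict, List

# A dialogue state maps slot names to their current values.
DialogueState = Dict[str, str]


def update_state(previous: DialogueState, turn_update: DialogueState) -> DialogueState:
    """Carry over earlier slot values and overwrite those set in this turn.

    A value of "none" (a convention assumed here) removes the slot,
    modelling a user who retracts a constraint.
    """
    state = dict(previous)  # copy so earlier turn states stay unchanged
    for slot, value in turn_update.items():
        if value == "none":
            state.pop(slot, None)
        else:
            state[slot] = value
    return state


def track(turn_updates: List[DialogueState]) -> List[DialogueState]:
    """Return the accumulated dialogue state after every turn."""
    states: List[DialogueState] = []
    state: DialogueState = {}
    for update in turn_updates:
        state = update_state(state, update)
        states.append(state)
    return states


# Hypothetical per-turn slot values, as a tracker might predict them.
turn_updates = [
    {"hotel-area": "centre"},
    {"hotel-pricerange": "cheap"},
    {"hotel-area": "north", "hotel-parking": "yes"},
]

states = track(turn_updates)
for turn_index, turn_state in enumerate(states, start=1):
    print(turn_index, turn_state)
```

In this simplified picture a tracker only has to produce the per-turn updates. The abstract's point is that those predictions improve when each slot is also informed by the utterances and by the other slots, both within a turn and across turns.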
Data availability
The datasets analyzed during the current study are available in the multiwoz repository (https://github.com/budzianowski/multiwoz) for the MultiWOZ 2.1 and 2.2 datasets and in the N2N-Dialogue-System repository (https://github.com/Yusser95/N2N-Dialogue-System) for the WOZ 2.0 dataset.
Ethics declarations
Conflict of interest
We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.
About this article
Cite this article
Qiu, J., Lin, Z., Zhang, H. et al. Hierarchical temporal slot interactions for dialogue state tracking. Neural Comput & Applic 35, 5791–5805 (2023). https://doi.org/10.1007/s00521-022-07959-y
DOI: https://doi.org/10.1007/s00521-022-07959-y