
A Unified Shared-Private Network with Denoising for Dialogue State Tracking

  • Regular Paper
  • Published in: Journal of Computer Science and Technology

Abstract

Dialogue state tracking (DST) leverages dialogue information to predict dialogue states, which are generally represented as slot-value pairs. However, previous work often struggles to predict values efficiently because it lacks a strategy for generating values from both the dialogue history and the predefined value set. By predicting values only from the predefined value set, discriminative DST methods have difficulty handling unknown values. Generative DST methods instead determine values from mentions in the dialogue history, which makes it difficult for them to handle uncovered and non-pointable mentions. Moreover, existing generative DST methods usually ignore unlabeled instances and suffer from label noise, which limits mention generation and ultimately hurts performance. In this paper, we propose a unified shared-private network (USPN) that generates values from both the dialogue history and the predefined values through a unified strategy. Specifically, USPN uses an encoder to construct a complete generative space for each slot and to discern shared information between slots through a shared-private architecture. Our model then predicts values from the generative space through a shared-private decoder. We further employ reinforcement learning to alleviate the label noise problem by learning indirect supervision from semantic relations between conversational words and predefined slot-value pairs. Experimental results on three public datasets show the effectiveness of USPN, which outperforms state-of-the-art baselines on both supervised and unsupervised DST tasks.
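As a rough illustration of the shared-private encoding idea described above — one encoder shared across all slots plus a private encoder per slot, whose outputs are combined into each slot's representation — the structure can be sketched as follows. This is a minimal numpy mock-up, not the paper's implementation: the layer shapes, slot names, and the use of plain random linear maps in place of trained LSTM/attention encoders are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 8
slots = ["food", "area", "price range"]  # hypothetical slot set


def linear(dim_in, dim_out):
    """Random projection standing in for a trained encoder layer."""
    return rng.standard_normal((dim_in, dim_out)) * 0.1


# One encoder shared by every slot, plus one private encoder per slot.
shared_enc = linear(HIDDEN, HIDDEN)
private_enc = {s: linear(HIDDEN, HIDDEN) for s in slots}


def encode(slot, dialogue_repr):
    """Combine the shared and slot-private views of the dialogue encoding.

    The shared half captures information common to all slots; the private
    half is specific to this slot. Concatenating them yields the slot's
    representation in the generative space.
    """
    shared = np.tanh(dialogue_repr @ shared_enc)
    private = np.tanh(dialogue_repr @ private_enc[slot])
    return np.concatenate([shared, private])  # 2*HIDDEN-dimensional


dialogue_repr = rng.standard_normal(HIDDEN)  # stand-in utterance encoding
state = encode("food", dialogue_repr)
print(state.shape)  # (16,)
```

Note that the shared half of the output is identical across slots for the same dialogue encoding, while the private half differs per slot — this is the mechanism by which a shared-private architecture separates cross-slot information from slot-specific information.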



Author information

Correspondence to Shi-Zhu He.

Supplementary Information

ESM 1 (PDF 222 kb)


About this article


Cite this article

Liu, QB., He, SZ., Liu, K. et al. A Unified Shared-Private Network with Denoising for Dialogue State Tracking. J. Comput. Sci. Technol. 36, 1407–1419 (2021). https://doi.org/10.1007/s11390-020-0338-0
