A hierarchical approach for efficient multi-intent dialogue policy learning

Multimedia Tools and Applications

Abstract

This paper proposes a hierarchical method for learning an efficient Dialogue Management (DM) strategy for task-oriented conversations that serve multiple intents within a domain. Deep Reinforcement Learning (DRL) networks, each specializing in an individual intent, communicate with one another and can share information that overlaps across intents. This sharing of information across the state space, together with a global slot tracker, prevents the agent from re-asking for information it already knows. The system can therefore handle sub-dialogues, each covering a subset of intents served by a different Reinforcement Learning (RL) model, and complete the dialogue without asking again for already provided information that is common across intents. The system is demonstrated on the “Air Travel” domain. The experimental results indicate that it is efficient and scalable, and that it serves multi-intent dialogues adequately. When applied to 5-intent dialogues, the proposed system attains a 41% improvement in dialogue length compared with a single-intent based system serving the same 5 intents.
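The role of the global slot tracker can be illustrated with a minimal sketch. This is not the authors' implementation: the learned DRL policies are replaced here by a fixed rule that asks for every missing slot, and the intent names, slot names, and the `GlobalSlotTracker` and `run_dialogue` helpers are hypothetical, chosen only to show how a shared tracker avoids repeated questions across intents.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalSlotTracker:
    """Slot values collected so far, shared by all intent-level agents."""
    values: dict = field(default_factory=dict)

    def update(self, slot, value):
        self.values[slot] = value

    def missing(self, required):
        # Slots the current intent needs that no earlier sub-dialogue filled.
        return [s for s in required if s not in self.values]


# Hypothetical intents and slots for an air-travel style domain.
INTENT_SLOTS = {
    "book_flight": ["origin", "destination", "date"],
    "check_fare": ["origin", "destination", "travel_class"],
}


def run_dialogue(intents, user_answers, tracker=None):
    """Serve each intent in turn and return the slots the agent asked for.

    Assumption: a rule-based stand-in for the learned policy, which simply
    requests each slot that the shared tracker does not yet hold.
    """
    if tracker is None:
        tracker = GlobalSlotTracker()
    asked = []
    for intent in intents:
        for slot in tracker.missing(INTENT_SLOTS[intent]):
            asked.append(slot)
            tracker.update(slot, user_answers[slot])
    return asked


answers = {
    "origin": "Delhi",
    "destination": "Mumbai",
    "date": "2021-03-01",
    "travel_class": "economy",
}
asked = run_dialogue(["book_flight", "check_fare"], answers)
print(asked)
```

In this toy setting the second intent only asks for `travel_class`, because `origin` and `destination` were already recorded while serving the first intent. Two agents with separate trackers would ask six questions instead of four.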

Notes

  1. Ran on Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz, 251GB RAM

Acknowledgements

Dr. Sriparna Saha gratefully acknowledges the Young Faculty Research Fellowship (YFRF) Award, supported by Visvesvaraya Ph.D. Scheme for Electronics and IT, Ministry of Electronics and Information Technology (MeitY), Government of India, being implemented by Digital India Corporation (formerly Media Lab Asia) for carrying out this research.

Author information

Corresponding author

Correspondence to Sriparna Saha.

About this article

Cite this article

Saha, T., Gupta, D., Saha, S. et al. A hierarchical approach for efficient multi-intent dialogue policy learning. Multimed Tools Appl 80, 35025–35050 (2021). https://doi.org/10.1007/s11042-020-09070-7

  • DOI: https://doi.org/10.1007/s11042-020-09070-7
