Abstract
This paper proposes a hierarchical method for learning an efficient Dialogue Management (DM) strategy for task-oriented conversations that serve multiple intents of a domain. Deep Reinforcement Learning (DRL) networks, each specializing in an individual intent, communicate with one another and can share information that overlaps across intents. This sharing of information across the state space, together with a global slot tracker, prevents the agent from re-asking for information it already knows. The system can therefore handle sub-dialogues, each covering a subset of intents served by different Reinforcement Learning (RL) models, and complete the dialogue without asking again for information that is common across intents. The developed system is demonstrated on the “Air Travel” domain. Experimental results indicate that the system is efficient and scalable and can adequately serve dialogues involving multiple intents. When applied to 5-intent dialogues, the proposed system improves dialogue length by 41% compared with a single-intent based system serving the same 5 intents.
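The effect of a global slot tracker on dialogue length can be illustrated with a minimal sketch. This is not the paper's implementation: the learned DRL policies are replaced here by a simple rule (ask for every slot that is still missing), and the intent names, slot names, and user goal are hypothetical. The resulting counts are illustrative only and are unrelated to the reported 41% figure.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalSlotTracker:
    """Shared store of slot values the user has already provided."""
    values: dict = field(default_factory=dict)

    def missing(self, required):
        # Slots this intent still needs, in the order given.
        return [slot for slot in required if slot not in self.values]

    def fill(self, slot, value):
        self.values[slot] = value


# Hypothetical intents and slots for an air-travel domain (assumption).
INTENT_SLOTS = {
    "book_flight": ["origin", "destination", "date"],
    "check_fare": ["origin", "destination", "travel_class"],
}


def run_dialogue(intents, user_goal, share_slots=True):
    """Count the system questions needed to serve all intents.

    A rule-based stand-in for the intent-level policies: each intent
    asks once for every slot it still lacks.
    """
    tracker = GlobalSlotTracker()
    questions = 0
    for intent in intents:
        if not share_slots:
            # Without sharing, each intent starts with an empty tracker.
            tracker = GlobalSlotTracker()
        for slot in tracker.missing(INTENT_SLOTS[intent]):
            tracker.fill(slot, user_goal[slot])  # simulated user answer
            questions += 1
    return questions


goal = {
    "origin": "Delhi",
    "destination": "Mumbai",
    "date": "Friday",
    "travel_class": "economy",
}
shared = run_dialogue(["book_flight", "check_fare"], goal, share_slots=True)
isolated = run_dialogue(["book_flight", "check_fare"], goal, share_slots=False)
print(shared, isolated)  # prints "4 6"
```

With sharing enabled, the second intent asks only for `travel_class`, because `origin` and `destination` were already collected by the first intent.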










Notes
Ran on Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz, 251GB RAM
Acknowledgements
Dr. Sriparna Saha gratefully acknowledges the Young Faculty Research Fellowship (YFRF) Award, supported by the Visvesvaraya Ph.D. Scheme for Electronics and IT, Ministry of Electronics and Information Technology (MeitY), Government of India, and implemented by Digital India Corporation (formerly Media Lab Asia), for carrying out this research.
Cite this article
Saha, T., Gupta, D., Saha, S. et al. A hierarchical approach for efficient multi-intent dialogue policy learning. Multimed Tools Appl 80, 35025–35050 (2021). https://doi.org/10.1007/s11042-020-09070-7
DOI: https://doi.org/10.1007/s11042-020-09070-7