Abstract
Generating responses that are semantically and emotionally consistent with the context is key to intelligent dialogue systems. Previous works mainly rely on the dialogue history to generate semantically related responses and ignore the emotion implicit in the conversation. In addition, most existing methods do not consider the interlocutors’ emotional changes and the emotional categories at the same time, even though emotion is crucial for reflecting an interlocutor’s intent. In this paper, we propose an Emotion Capture Chat Machine (ECCM) that captures both the explicit and the underlying emotional signals in the context to generate appropriate responses. Specifically, we design a hierarchical recursive encoder-decoder framework with two enhanced self-attention encoders that capture the semantic signal and the emotional signal, respectively; the two signals are then fused in the decoder to produce the response. Overall, ECCM accounts for the dynamic and underlying emotional information when generating responses in multi-turn dialogues, in both daily conversation and psychological counseling. Experimental results on a Chinese daily conversation dataset and a psychological counseling dataset show that ECCM outperforms the state-of-the-art baselines in terms of Perplexity, Distinct-1, Distinct-2, and manual evaluation. We also find that ECCM performs well on input contexts of different lengths.
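For readers unfamiliar with the automatic metrics named in the abstract, the following is a minimal sketch of how Distinct-n and Perplexity are commonly computed. It is not the authors' evaluation code: the function names `distinct_n` and `perplexity`, the pre-tokenized input format, and the use of natural-log token probabilities are all assumptions made for illustration.

```python
import math
from collections import Counter


def distinct_n(responses, n):
    """Distinct-n: unique n-grams divided by total n-grams over all responses.

    `responses` is a list of token lists (hypothetical input format).
    Higher values indicate more diverse generated text.
    """
    ngrams = Counter()
    for tokens in responses:
        for i in range(len(tokens) - n + 1):
            ngrams[tuple(tokens[i:i + n])] += 1
    total = sum(ngrams.values())
    return len(ngrams) / total if total else 0.0


def perplexity(token_log_probs):
    """Perplexity: exp of the negative mean natural-log probability per token.

    `token_log_probs` holds the log-probability a model assigned to each
    reference token (assumed to be supplied by the caller).
    Lower values indicate a better fit to the reference responses.
    """
    if not token_log_probs:
        raise ValueError("token_log_probs must not be empty")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


if __name__ == "__main__":
    sample = [["i", "am", "fine"], ["i", "am", "happy"]]
    print(distinct_n(sample, 1))  # 4 unique unigrams out of 6
    print(distinct_n(sample, 2))  # 3 unique bigrams out of 4
    print(perplexity([math.log(0.25)] * 4))
```

Under these assumptions, Distinct-1 and Distinct-2 reward lexical diversity in the generated responses, while Perplexity measures how well the model predicts the reference responses; the paper's exact tokenization and averaging scheme may differ.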
Notes
The dataset is available at https://github.com/codemayq/chinese_chatbot_corpus
The dataset is available at https://www.52nlp.cn/efaqa-corpus-zh
Acknowledgements
We are especially grateful to participants for completing the human evaluation.
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
About this article
Cite this article
Mao, Y., Cai, F., Guo, Y. et al. Incorporating emotion for response generation in multi-turn dialogues. Appl Intell 52, 7218–7229 (2022). https://doi.org/10.1007/s10489-021-02819-z
DOI: https://doi.org/10.1007/s10489-021-02819-z