Abstract
Recognizing and expressing emotion are key to the success of multi-turn conversations. Emotion recognition, which helps model the relationship between query and response, has been employed in single-turn conversation models. However, little work so far has focused on infusing emotion into multi-turn conversation generation. To address this gap, we propose the Multi-turn Emotional Conversation Model (MECM), which uses multi-task learning to improve the ability to represent emotions in multi-turn conversations. MECM is based on a hierarchical latent variable model that uses the context hidden state to share common information. It also contains an emotion classifier, which helps the model recognize the emotion in the conversation, and a conversation generator, which maintains the consistency of content and the transformation of emotion. Experimental results show that our model significantly improves the quality of responses in terms of diversity and empathy, while achieving better semantic similarity than baseline methods.
Appendix A
The architectures of the ‘-cls’ and ‘-ctx’ variants in the ablation study of our MECM. MECM, ‘-cls’ and ‘-ctx’ (Figs. 1, 2 and 3, respectively) differ in what controls the emotional latent variable z_e. In Fig. 1, z_e is controlled by the classifier and the context information simultaneously. In Fig. 2, z_e is guided by the context information only. In Fig. 3, only a classifier is used to control z_e.
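The three configurations described above can be summarized in a small sketch. It is only an illustration of which signals control the emotional latent variable in each variant. The class name, flag names and helper function are hypothetical and do not come from the authors' implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LatentControl:
    """Signals that control the emotional latent variable (hypothetical flags)."""
    use_classifier: bool
    use_context: bool


# Variant names follow the appendix; the flag layout is an assumption.
VARIANTS = {
    "MECM": LatentControl(use_classifier=True, use_context=True),   # Fig. 1
    "-cls": LatentControl(use_classifier=False, use_context=True),  # Fig. 2
    "-ctx": LatentControl(use_classifier=True, use_context=False),  # Fig. 3
}


def controllers(variant: str) -> List[str]:
    """Return the names of the signals controlling the latent variable."""
    cfg = VARIANTS[variant]
    names = []
    if cfg.use_classifier:
        names.append("classifier")
    if cfg.use_context:
        names.append("context")
    return names
```

Calling controllers("-cls") lists only the context signal, which matches the description of Fig. 2, where the classifier is removed.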
Cite this article
Cui, F., Di, H., Shen, L. et al. Modeling semantic and emotional relationship in multi-turn emotional conversations using multi-task learning. Appl Intell 52, 4663–4673 (2022). https://doi.org/10.1007/s10489-021-02683-x
DOI: https://doi.org/10.1007/s10489-021-02683-x