Modeling semantic and emotional relationship in multi-turn emotional conversations using multi-task learning

Applied Intelligence

Abstract

Recognition and expression of emotion are key factors in the success of multi-turn conversations. Emotion recognition, which helps model the relationship between query and response, has been employed in single-turn conversation models. However, little work so far has focused on infusing the emotional factor into multi-turn conversation generation. To address this gap, we propose the Multi-turn Emotional Conversation Model (MECM), which uses multi-task learning to improve the ability to represent emotions in multi-turn conversations. MECM is based on a hierarchical latent variable model that uses the context hidden state to share common information. It also contains an emotion classifier, which helps the model recognize the emotion in the conversation, and a conversation generator, which maintains consistency of content and transformation of emotion. Experimental results show that our model significantly improves the quality of responses in terms of diversity and empathy, and maintains better semantic similarity than the baseline methods.
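As an illustration of the multi-task idea described above, the following minimal sketch combines a response-generation loss with an emotion-classification loss. It is not the MECM objective: the actual objective is not given in this preview and may contain further terms (for example, terms for the latent variables). The function names and the `cls_weight` parameter are assumptions introduced here for illustration only.

```python
import math


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the target class under a probability vector."""
    return -math.log(probs[target_index])


def multitask_loss(token_probs, token_targets, emotion_probs, emotion_target,
                   cls_weight=1.0):
    """Hypothetical joint loss: generation loss plus weighted classification loss.

    token_probs    -- one probability vector per generated token
    token_targets  -- index of the reference token at each position
    emotion_probs  -- probability vector over emotion labels
    emotion_target -- index of the gold emotion label
    cls_weight     -- assumed trade-off weight (not taken from the paper)
    """
    # Generation task: mean token-level cross-entropy over the response.
    gen_loss = sum(
        cross_entropy(p, t) for p, t in zip(token_probs, token_targets)
    ) / len(token_targets)
    # Classification task: cross-entropy on the emotion label.
    cls_loss = cross_entropy(emotion_probs, emotion_target)
    # Both tasks are optimized together through a single scalar.
    return gen_loss + cls_weight * cls_loss
```

Setting `cls_weight=0.0` reduces the sketch to a plain generation loss, which is one simple way to see what the auxiliary classification task adds.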

Author information

Corresponding author

Correspondence to Jinan Xu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

The architectures of ‘-cls’ and ‘-ctx’ in the ablation study of our MECM. MECM, ‘-cls’ and ‘-ctx’, shown in Figs. 1, 2 and 3 respectively, differ in what controls the emotional latent variables ze. In Fig. 1, ze is controlled by the classifier and the context information simultaneously. In Fig. 2, ze is guided by the context information only. In Fig. 3, only a classifier is used to control ze.

Fig. 1

Graphical model of MECM. ut is the t-th utterance, \(h_{t}^{ctx}\) represents the t-th context information, and zc, ze and zr are the global latent variables, emotional latent variables and utterance-level latent variables, respectively. CLS is a classifier and \(L_{n}^{\prime }\) is the output label. A dotted line indicates a part that does not exist at test time. More details of our MECM are given in Section 2

Fig. 2

Graphical model of MECM in which the emotional latent variables are not controlled by the classifier

Fig. 3

Graphical model of MECM in which the emotional latent variables are not controlled by the context
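The three variants described in this appendix differ only in which signals control ze, which can be summarized as a small configuration table. The sketch below is purely illustrative: the class name, field names and helper function are hypothetical and do not come from the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmotionLatentControl:
    """Which signals control the emotional latent variables ze (hypothetical)."""
    use_classifier: bool
    use_context: bool


# Variant names follow the ablation study; figure numbers follow the captions.
VARIANTS = {
    "MECM": EmotionLatentControl(use_classifier=True, use_context=True),   # Fig. 1
    "-cls": EmotionLatentControl(use_classifier=False, use_context=True),  # Fig. 2
    "-ctx": EmotionLatentControl(use_classifier=True, use_context=False),  # Fig. 3
}


def controllers(variant_name):
    """Return the names of the signals that control ze in the given variant."""
    control = VARIANTS[variant_name]
    active = []
    if control.use_classifier:
        active.append("classifier")
    if control.use_context:
        active.append("context")
    return active
```

For example, `controllers("-cls")` lists only the context, matching the description of Fig. 2.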

About this article

Cite this article

Cui, F., Di, H., Shen, L. et al. Modeling semantic and emotional relationship in multi-turn emotional conversations using multi-task learning. Appl Intell 52, 4663–4673 (2022). https://doi.org/10.1007/s10489-021-02683-x

  • DOI: https://doi.org/10.1007/s10489-021-02683-x
