Generating Emotional Social Chatbot Responses with a Consistent Speaking Style

Zhang, Jun; Yang, Yan; Chen, Chengcai; He, Liang; Yu, Zhou

doi:10.1007/978-3-030-60457-8_5

Conference paper
First Online: 02 October 2020

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12431)

Abstract

Emotional conversation plays a vital role in creating more human-like conversations. Although previous work on emotional conversation generation has achieved promising results, inconsistency in speaking style remains a problem. In this paper, we propose a Style-Aware Emotional Dialogue System (SEDS) that enhances speaking style consistency by detecting the user’s emotions and modeling speaking styles during emotional response generation. Specifically, SEDS uses an emotion encoder to perceive the user’s emotion from multimodal inputs, and tracks speaking styles by jointly optimizing a generator that is augmented with a personalized lexicon to capture explicit word-level speaking style features. Additionally, we propose an auxiliary speaking style classification task to guide SEDS to learn the implicit form of speaking style during training. To verify the effectiveness of the model, we construct a multimodal dialogue dataset, which we align and annotate. Experimental results show that SEDS achieves a significant improvement over strong baseline models in terms of perplexity, emotion accuracy, and style consistency.
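
The abstract describes joint optimization of a response generator and an auxiliary speaking style classifier, but it does not give the form of the objective. The sketch below illustrates one common way such a joint objective might be written: a per-token generation cross-entropy plus a weighted style classification cross-entropy. The function names, the weighted-sum form, and the `style_weight` parameter are assumptions made for illustration only and are not taken from the paper or its repository.

```python
import math
from typing import Sequence


def cross_entropy(probs: Sequence[float], target: int) -> float:
    """Negative log-likelihood of the target index under a probability vector."""
    return -math.log(probs[target])


def joint_loss(
    token_probs: Sequence[Sequence[float]],
    target_tokens: Sequence[int],
    style_probs: Sequence[float],
    target_style: int,
    style_weight: float = 0.5,  # hypothetical trade-off weight, not from the paper
) -> float:
    """Illustrative joint objective: generation loss + weighted style loss.

    token_probs[i] is the predicted distribution over the vocabulary at step i;
    style_probs is the predicted distribution over speaking style classes.
    """
    # Mean per-token cross-entropy of the generated response.
    gen_loss = sum(
        cross_entropy(p, t) for p, t in zip(token_probs, target_tokens)
    ) / len(target_tokens)
    # Cross-entropy of the auxiliary speaking style classifier.
    style_loss = cross_entropy(style_probs, target_style)
    return gen_loss + style_weight * style_loss


# Toy example: two decoding steps over a 2-word vocabulary, two style classes.
loss = joint_loss(
    token_probs=[[0.5, 0.5], [0.25, 0.75]],
    target_tokens=[0, 1],
    style_probs=[0.5, 0.5],
    target_style=0,
)
```

With `style_weight=0.0` the expression reduces to the generation loss alone, which makes the contribution of the auxiliary term easy to isolate in such a sketch.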

Notes

1. https://github.com/562225807/SEDS

Author information

Authors and Affiliations

School of Computer Science and Technology, East China Normal University, Shanghai, China
Jun Zhang, Yan Yang & Liang He
Xiaoi Research, Shanghai, China
Chengcai Chen
University of California, Davis, USA
Zhou Yu

Corresponding author

Correspondence to Yan Yang.

Editor information

Editors and Affiliations

ECE & Ingenuity Labs Research Institute, Queen’s University, Kingston, ON, Canada
Xiaodan Zhu
Department of Computer Science and Technology, Tsinghua University, Beijing, China
Min Zhang
School of Computer Science and Technology, Soochow University, Suzhou, China
Yu Hong
College of Intelligence and Computing, Tianjin University, Tianjin, China
Ruifang He

Cite this paper

Zhang, J., Yang, Y., Chen, C., He, L., Yu, Z. (2020). Generating Emotional Social Chatbot Responses with a Consistent Speaking Style. In: Zhu, X., Zhang, M., Hong, Y., He, R. (eds) Natural Language Processing and Chinese Computing. NLPCC 2020. Lecture Notes in Computer Science, vol 12431. Springer, Cham. https://doi.org/10.1007/978-3-030-60457-8_5

DOI: https://doi.org/10.1007/978-3-030-60457-8_5
Published: 02 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60456-1
Online ISBN: 978-3-030-60457-8
eBook Packages: Computer Science, Computer Science (R0)
