Natural Language Dialog System Considering Speaker’s Emotion Calculated from Acoustic Features

Takahashi, Takumi; Mera, Kazuya; Nhat, Tang Ba; Kurosawa, Yoshiaki; Takezawa, Toshiyuki

doi:10.1007/978-981-10-2585-3_11

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 427)

Abstract

With the development of Interactive Voice Response (IVR) systems, people can not only operate computer systems through task-oriented conversation but also enjoy non-task-oriented conversation with the computer. When an IVR system generates a response, it usually refers only to the verbal information in the user’s utterance. However, when a person gloomily says “I’m fine,” a listener will respond not with “That’s wonderful” but with “Really?” or “Are you OK?”, because people consider both verbal and non-verbal information, such as tone of voice, facial expressions, and gestures. In this article, we propose an intelligent IVR system that considers not only verbal but also non-verbal information. To estimate a speaker’s emotion (positive, negative, or neutral), 384 acoustic features extracted from the speaker’s utterance are used as input to a machine learning classifier, a support vector machine (SVM). Artificial Intelligence Markup Language (AIML)-based response generation rules are extended so that they can take the speaker’s emotion into account. In the experiment, subjects felt that the proposed dialog system was more likable and enjoyable and that it did not give machine-like reactions.
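
The idea of emotion-aware response rules described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the rule table `RULES`, the functions `normalize` and `respond`, the `"*"` fallback key, and the default reply are all hypothetical names chosen for illustration, and the emotion label is passed in directly rather than estimated by an SVM from acoustic features. The two example replies are taken from the “I’m fine” example in the abstract.

```python
import re

# Hypothetical rule table: (normalized pattern, emotion) -> response.
# The emotion key "*" is an assumed fallback used when no emotion-specific
# rule exists; it mimics a system that looks at verbal information only.
RULES = {
    ("I AM FINE", "negative"): "Are you OK?",
    ("I AM FINE", "*"): "That's wonderful.",
}

# Assumed reply when no pattern matches at all.
DEFAULT_RESPONSE = "I see."


def normalize(utterance: str) -> str:
    """Uppercase the text, expand one contraction, and drop punctuation."""
    text = utterance.upper().replace("’", "'")
    text = text.replace("I'M", "I AM")
    text = re.sub(r"[^A-Z0-9 ]", " ", text)
    return " ".join(text.split())


def respond(utterance: str, emotion: str) -> str:
    """Pick a reply using both the words and the estimated emotion label."""
    key = normalize(utterance)
    # Try the emotion-specific rule first, then the emotion-independent one.
    for emo in (emotion, "*"):
        if (key, emo) in RULES:
            return RULES[(key, emo)]
    return DEFAULT_RESPONSE
```

Calling `respond("I'm fine.", "negative")` selects the emotion-specific rule, while `respond("I'm fine.", "positive")` falls back to the emotion-independent rule, so the same words yield different replies depending on the estimated emotion.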

Acknowledgements

This research is supported by JSPS KAKENHI Grant Number 26330313 and the Center of Innovation Program from the Japan Science and Technology Agency (JST).

Author information

Authors and Affiliations

Graduate School of Information Sciences, Hiroshima City University, 3-4-1, Ozuka-higashi, Asa-minami-ku, Hiroshima, 731-3194, Japan
Takumi Takahashi, Kazuya Mera, Yoshiaki Kurosawa & Toshiyuki Takezawa
FPT Software, Tokyo, Japan
Tang Ba Nhat

Corresponding author

Correspondence to Kazuya Mera.

Editor information

Editors and Affiliations

Institute of Behavioural Sciences, University of Helsinki, Helsinki, Finland
Kristiina Jokinen
University of Helsinki, Helsinki, Finland
Graham Wilcock

About this chapter

Cite this chapter

Takahashi, T., Mera, K., Nhat, T.B., Kurosawa, Y., Takezawa, T. (2017). Natural Language Dialog System Considering Speaker’s Emotion Calculated from Acoustic Features. In: Jokinen, K., Wilcock, G. (eds) Dialogues with Social Robots. Lecture Notes in Electrical Engineering, vol 427. Springer, Singapore. https://doi.org/10.1007/978-981-10-2585-3_11

DOI: https://doi.org/10.1007/978-981-10-2585-3_11
Published: 25 December 2016
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-2584-6
Online ISBN: 978-981-10-2585-3
eBook Packages: Engineering, Engineering (R0)
