Detecting changing emotions in human speech by machine and humans

van der Wal, C. Natalie; Kowalczyk, Wojtek

doi:10.1007/s10489-013-0449-1

Published: 15 June 2013

Applied Intelligence, Volume 39, pages 675–691 (2013)

Abstract

The goals of this research were: (1) to develop a system that automatically measures changes in the emotional state of a speaker by analyzing his/her voice, (2) to validate this system with a controlled experiment and (3) to visualize the results to the speaker in 2-d space. Natural (non-acted) speech of 77 Dutch speakers was collected and manually divided into meaningful speech units. Three recordings per speaker were collected, one each in a positive, a neutral and a negative state. For each recording, the speakers rated 16 emotional states on a 10-point Likert scale. The Random Forest algorithm was applied to 207 speech features extracted from the recordings to qualify (classification) and quantify (regression) the changes in the speaker’s emotional state. Results showed that both the direction of emotional change and the change in intensity, the latter evaluated by Mean Squared Error, can be predicted better than the baseline (the most frequent class label and the mean value of change, respectively). Moreover, changes in negative emotions turned out to be more predictable than changes in positive emotions. A controlled experiment investigated the difference between human and machine performance in judging the emotional states in one’s own voice and in that of another person. Results showed that humans performed worse than the algorithm on both the detection and the regression problems. Humans, like the algorithm, were better at detecting changes in negative emotions than in positive ones. Finally, applying Principal Component Analysis (PCA) to the data provided a validation of dimensional emotion theories, and the results suggest that PCA is a promising technique for visualizing the user’s emotional state in the envisioned application.
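The two baselines named in the abstract can be illustrated with a minimal sketch. It assumes the classification baseline always predicts the most frequent class label and the regression baseline always predicts the mean value of change. The toy labels, values and "model" predictions below are hypothetical and are not taken from the study; the function names are illustrative only.

```python
from collections import Counter
from statistics import mean


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent class label."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


def mean_baseline_mse(values):
    """Mean Squared Error obtained by always predicting the mean of the values."""
    mu = mean(values)
    return mean((v - mu) ** 2 for v in values)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return mean(1.0 if t == p else 0.0 for t, p in zip(y_true, y_pred))


def mse(y_true, y_pred):
    """Mean Squared Error between true and predicted values."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


# Hypothetical toy data (NOT from the paper): direction of change of one
# emotion between two recordings, and the change in its self-reported rating.
direction_true = ["up", "down", "down", "same", "down", "up"]
direction_pred = ["up", "down", "down", "down", "down", "same"]
change_true = [2.0, -3.0, -1.0, 0.0, -2.0, 1.0]
change_pred = [1.5, -2.5, -1.0, -0.5, -1.5, 0.0]

print("classification accuracy:", accuracy(direction_true, direction_pred))
print("majority-class baseline:", majority_baseline_accuracy(direction_true))
print("regression MSE:", mse(change_true, change_pred))
print("mean-value baseline MSE:", mean_baseline_mse(change_true))
```

A model is only informative in this sense if its accuracy exceeds the majority-class baseline and its MSE falls below the mean-value baseline, which is the comparison the abstract reports for the Random Forest predictions.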

Author information

Authors and Affiliations

Department of Artificial Intelligence, VU University Amsterdam, De Boelelaan 1081, 1081 HV Amsterdam, The Netherlands
C. Natalie van der Wal
Leiden Institute of Advanced Computer Science, Leiden University, Niels Bohrweg 1, 2333 CA Leiden, The Netherlands
Wojtek Kowalczyk

Corresponding author

Correspondence to C. Natalie van der Wal.

About this article

Cite this article

van der Wal, C.N., Kowalczyk, W. Detecting changing emotions in human speech by machine and humans. Appl Intell 39, 675–691 (2013). https://doi.org/10.1007/s10489-013-0449-1

Issue Date: December 2013
DOI: https://doi.org/10.1007/s10489-013-0449-1
