Abstract
Emotional content analysis is increasingly prevalent in speech-based human-machine interaction, in tasks such as emotion recognition and expressive speech synthesis. Within this framework, this paper aims to compare the emotional content of a pair of speech signals uttered by different speakers and not necessarily sharing the same text. This exploratory work employs machine learning methods to analyze emotional content in speech from two angles: (a) evaluating the relevance of the features used in the analysis of emotions, and (b) calculating the similarity of emotional content independently of speaker and text. The final goal is to provide a metric for comparing emotional content in speech. Such a metric would form the basis for higher-level tasks, such as clustering utterances by emotional content or applying kernel methods to expressive speech analysis.
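To make the goal concrete, the following is a minimal sketch of what such a similarity metric could look like; it is not the method developed in the paper. The MFCC features, the mean/standard-deviation pooling, and the file names are illustrative assumptions standing in for whatever emotion-relevant feature set is actually selected.

```python
# Illustrative sketch only: MFCCs stand in for the paper's feature set.
import librosa
import numpy as np

def utterance_embedding(path, sr=16000, n_mfcc=13):
    """Pool frame-level MFCCs into a fixed-length utterance vector
    (mean and standard deviation over time)."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # shape: (n_mfcc, n_frames)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

def emotional_similarity(path_a, path_b):
    """Cosine similarity between two utterance-level feature vectors."""
    a, b = utterance_embedding(path_a), utterance_embedding(path_b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical usage: compare recordings by two different speakers.
# score = emotional_similarity("speaker1_angry.wav", "speaker2_angry.wav")
```

In practice, such a raw acoustic similarity is confounded by speaker identity and textual content; the point of the feature analysis in the paper is precisely to identify features for which the comparison reflects emotional content alone.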
Acknowledgments
This work was supported by the research grant "Fondi di ricerca di ateneo 2016" of the University of Genova.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Mnasri, Z., Rovetta, S., Masulli, F. (2020). Feature Analysis for Emotional Content Comparison in Speech. In: Ju, Z., Yang, L., Yang, C., Gegov, A., Zhou, D. (eds) Advances in Computational Intelligence Systems. UKCI 2019. Advances in Intelligent Systems and Computing, vol 1043. Springer, Cham. https://doi.org/10.1007/978-3-030-29933-0_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29932-3
Online ISBN: 978-3-030-29933-0
eBook Packages: Intelligent Technologies and Robotics (R0)