
Feature Analysis for Emotional Content Comparison in Speech

Conference paper in: Advances in Computational Intelligence Systems (UKCI 2019).

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1043).

Abstract

Emotional content analysis is increasingly present in speech-based human-machine interaction, in tasks such as emotion recognition and expressive speech synthesis. In this framework, this paper aims to compare the emotional content of a pair of speech signals uttered by different speakers and not necessarily sharing the same text. This exploratory work employs machine learning methods to analyze emotional content in speech from two angles: (a) evaluating the relevance of the chosen features for the analysis of emotions, and (b) calculating the similarity of the emotional content independently of speaker and text. The final goal is to provide a metric for comparing emotional content in speech. Such a metric would form the basis for higher-level tasks, such as clustering utterances by emotional content or applying kernel methods to expressive speech analysis.
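The abstract describes a speaker- and text-independent similarity metric over acoustic features, but the concrete feature set and measure are not given here. As a minimal illustrative sketch only (not the authors' method), one could pool frame-level acoustic features (pitch, energy, MFCCs, or GeMAPS-style descriptors) into fixed-length utterance vectors, then compare them with a cosine similarity; the feature dimensions and pooling choice below are assumptions:

```python
import numpy as np

def utterance_vector(frames: np.ndarray) -> np.ndarray:
    """Pool frame-level features (shape: n_frames x n_features) into a
    fixed-length utterance-level vector via mean and standard deviation,
    so utterances of different lengths and texts become comparable."""
    return np.concatenate([frames.mean(axis=0), frames.std(axis=0)])

def emotional_similarity(frames_a: np.ndarray, frames_b: np.ndarray) -> float:
    """Cosine similarity between pooled feature vectors: values near 1.0
    indicate similar positions in the acoustic feature space."""
    a, b = utterance_vector(frames_a), utterance_vector(frames_b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two toy "utterances": 100 and 80 frames of 10 hypothetical features each.
rng = np.random.default_rng(0)
u1 = rng.normal(size=(100, 10))
u2 = u1[:80] + rng.normal(scale=0.1, size=(80, 10))  # acoustically close
print(round(emotional_similarity(u1, u2), 3))
```

A kernel built from such a similarity could then feed the clustering or kernel-method applications the abstract mentions, though the paper's actual features would come from a dedicated extractor rather than random toy data.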



Acknowledgments

This work was supported by the research grant "Fondi di ricerca di ateneo 2016" of the University of Genova.

Author information

Correspondence to Zied Mnasri.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Mnasri, Z., Rovetta, S., Masulli, F. (2020). Feature Analysis for Emotional Content Comparison in Speech. In: Ju, Z., Yang, L., Yang, C., Gegov, A., Zhou, D. (eds) Advances in Computational Intelligence Systems. UKCI 2019. Advances in Intelligent Systems and Computing, vol 1043. Springer, Cham. https://doi.org/10.1007/978-3-030-29933-0_41
