
Extracting Emotions and Communication Styles from Prosody

  • Conference paper
Physiological Computing Systems (PhyCS 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8908)


Abstract

According to many psychological and social studies, vocal messages contain two distinct channels: an explicit, linguistic channel and an implicit, paralinguistic channel. In particular, the latter carries information about the emotional state of the speaker, providing clues about the implicit meaning of the message. Such information can improve applications requiring human-machine interaction (for example, Automatic Speech Recognition systems or Conversational Agents), as well as support the analysis of human-human interactions (for example, clinical or forensic applications). PrEmA, the tool we present in this work, is able to recognize and classify both the emotions and the communication style of the speaker, relying on prosodic features. In particular, the recognition of communication styles is, to our knowledge, new, and could be used to infer interesting clues about the state of the interaction. PrEmA uses two LDA-based classifiers, which rely on two sets of prosodic features. Testing PrEmA with Italian speakers, we obtained \(Ac=71\,\%\) for emotions and \(Ac=86\,\%\) for communication styles.
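
The following minimal sketch (not the authors' implementation) illustrates the setup described above: two independent LDA classifiers, one for emotions and one for communication styles, each trained on per-utterance vectors of prosodic features. It assumes scikit-learn; the feature set, class counts, and data are placeholders.

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    # Placeholder prosodic features per utterance (e.g., pitch mean/range,
    # intensity mean/range, speech rate, pause ratio) and placeholder labels.
    rng = np.random.default_rng(0)
    X = rng.random((200, 6))                   # prosodic feature matrix
    y_emotion = rng.integers(0, 5, size=200)   # emotion labels (5 classes assumed)
    y_style = rng.integers(0, 3, size=200)     # communication-style labels (assumed)

    # One LDA classifier per task, as in the two-classifier design above.
    emotion_clf = LinearDiscriminantAnalysis().fit(X, y_emotion)
    style_clf = LinearDiscriminantAnalysis().fit(X, y_style)

    # Classify a new utterance from its prosodic feature vector.
    new_utterance = rng.random((1, 6))
    print(emotion_clf.predict(new_utterance), style_clf.predict(new_utterance))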


Notes

  1. Notice that the performance indexes provided in this section are indicative and cannot be compared with each other, since each system used its own vocal dataset.

  2. Such segments were considered too loud to be clear silences, but too quiet to provide a clear voiced signal.

  3. http://www.fon.hum.uva.nl/praat/.

  4. Of the 10 LDA-based classifiers generated for the emotion-classification task, the one with the best performance indexes was chosen as the final model; the same approach was followed for the communication-style classifier (see the sketch after these notes).
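
The selection step in note 4 can be illustrated with a short, hedged sketch. Purely as an assumption (the excerpt does not say how the 10 candidates differ or which performance index was used), the sketch builds the 10 LDA classifiers from 10 random train/test splits and keeps the one with the highest accuracy; all data are placeholders.

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    # Placeholder prosodic features and emotion labels.
    rng = np.random.default_rng(1)
    X = rng.random((200, 6))
    y = rng.integers(0, 5, size=200)

    best_model, best_acc = None, -1.0
    for seed in range(10):                        # 10 candidate LDA classifiers
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.3, random_state=seed)
        model = LinearDiscriminantAnalysis().fit(X_tr, y_tr)
        acc = accuracy_score(y_te, model.predict(X_te))
        if acc > best_acc:                        # keep the best-performing model
            best_model, best_acc = model, acc
    print(f"selected classifier accuracy: {best_acc:.2f}")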


Author information


Corresponding author

Correspondence to Luca Colombo.



Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sbattella, L., Colombo, L., Rinaldi, C., Tedesco, R., Matteucci, M., Trivilini, A. (2014). Extracting Emotions and Communication Styles from Prosody. In: da Silva, H., Holzinger, A., Fairclough, S., Majoe, D. (eds) Physiological Computing Systems. PhyCS 2014. Lecture Notes in Computer Science, vol 8908. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45686-6_2

  • DOI: https://doi.org/10.1007/978-3-662-45686-6_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-45685-9

  • Online ISBN: 978-3-662-45686-6
