
Extracting Emotions and Communication Styles from Prosody

  • Conference paper
Physiological Computing Systems (PhyCS 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8908)


Abstract

According to many psychological and social studies, vocal messages contain two distinct channels: an explicit, linguistic channel and an implicit, paralinguistic channel. In particular, the latter carries information about the emotional state of the speaker, providing clues about the implicit meaning of the message. Such information can improve applications requiring human-machine interaction (for example, Automatic Speech Recognition systems or Conversational Agents), as well as support the analysis of human-human interactions (for example, clinical or forensic applications). PrEmA, the tool we present in this work, is able to recognize and classify both the emotions and the communication style of the speaker, relying on prosodic features. In particular, the recognition of communication styles is, to our knowledge, new, and could be used to infer interesting clues about the state of the interaction. PrEmA uses two LDA-based classifiers, which rely on two sets of prosodic features. Testing PrEmA with Italian speakers, we obtained \(Ac=71\,\%\) for emotions and \(Ac=86\,\%\) for communication styles.
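
The following minimal sketch (not the authors' implementation) illustrates the setup described above: two independent LDA classifiers, one for emotions and one for communication styles, each trained on per-utterance vectors of prosodic features. It assumes scikit-learn; the feature set, class counts, and data are placeholders.

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    # Placeholder prosodic features per utterance (e.g., pitch mean/range,
    # intensity mean/range, speech rate, pause ratio) and placeholder labels.
    rng = np.random.default_rng(0)
    X = rng.random((200, 6))                   # prosodic feature matrix
    y_emotion = rng.integers(0, 5, size=200)   # emotion labels (5 classes assumed)
    y_style = rng.integers(0, 3, size=200)     # communication-style labels (assumed)

    # One LDA classifier per task, as in the two-classifier design above.
    emotion_clf = LinearDiscriminantAnalysis().fit(X, y_emotion)
    style_clf = LinearDiscriminantAnalysis().fit(X, y_style)

    # Classify a new utterance from its prosodic feature vector.
    new_utterance = rng.random((1, 6))
    print(emotion_clf.predict(new_utterance), style_clf.predict(new_utterance))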


Notes

  1. Notice that the performance indexes provided in this section are indicative and cannot be compared with each other, since each system used its own vocal dataset.

  2. Such segments were considered too loud to be clear silences, but too quiet to provide a clear voiced signal.

  3. http://www.fon.hum.uva.nl/praat/.

  4. Of the 10 LDA-based classifiers generated for the emotion-classification task, the one with the best performance indexes was chosen as the final model; the same approach was followed for the communication-style classifier (see the sketch after these notes).
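
The selection step in note 4 can be illustrated with a short, hedged sketch. Purely as an assumption (the excerpt does not say how the 10 candidates differ or which performance index was used), the sketch builds the 10 LDA classifiers from 10 random train/test splits and keeps the one with the highest accuracy; all data are placeholders.

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    # Placeholder prosodic features and emotion labels.
    rng = np.random.default_rng(1)
    X = rng.random((200, 6))
    y = rng.integers(0, 5, size=200)

    best_model, best_acc = None, -1.0
    for seed in range(10):                        # 10 candidate LDA classifiers
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.3, random_state=seed)
        model = LinearDiscriminantAnalysis().fit(X_tr, y_tr)
        acc = accuracy_score(y_te, model.predict(X_te))
        if acc > best_acc:                        # keep the best-performing model
            best_model, best_acc = model, acc
    print(f"selected classifier accuracy: {best_acc:.2f}")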


Author information


Corresponding author

Correspondence to Luca Colombo.



Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sbattella, L., Colombo, L., Rinaldi, C., Tedesco, R., Matteucci, M., Trivilini, A. (2014). Extracting Emotions and Communication Styles from Prosody. In: da Silva, H., Holzinger, A., Fairclough, S., Majoe, D. (eds) Physiological Computing Systems. PhyCS 2014. Lecture Notes in Computer Science, vol 8908. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45686-6_2

  • DOI: https://doi.org/10.1007/978-3-662-45686-6_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-45685-9

  • Online ISBN: 978-3-662-45686-6
