Abstract
This paper investigates the relation between emotions and feedback facial expressions in video- and audio-recorded Danish dyadic first encounters. In particular, we train a classifier on the manual annotations of the corpus in order to investigate to what extent the encoding of emotions contributes to the prediction of the feedback functions of facial expressions. This work builds upon and extends previous research on (a) the annotation and analysis of emotions in the corpus, which suggested that emotions are related to specific communicative functions, and (b) the prediction of feedback head movements using multimodal information. The results of the experiments show that information on the multimodal behaviours which co-occur with the facial expressions improves classifier performance. The improvement in F-measure over the unimodal baseline is 0.269, a result parallel to that obtained for head movements in the same corpus. The experiments also show that the annotations of emotions contribute further to the prediction of feedback facial expressions, confirming the relation between the two. The best results are obtained by training the classifier on the shape of the facial expressions together with co-occurring head movements, emotion labels, and the gesturer's and the interlocutor's speech, and these results can be used in applied systems.
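To make the experimental setup concrete, the following is a minimal sketch in Python with scikit-learn, not the toolkit used for the experiments reported here; all feature names, data values and the choice of learner are invented for the example. It shows the general scheme: categorical annotations of a facial expression and its co-occurring multimodal behaviours are one-hot encoded and fed to a classifier evaluated with ten-fold cross-validation.

# Minimal sketch (not the paper's actual setup) of predicting the feedback
# function of a facial expression from its shape and from co-occurring
# multimodal annotations. All feature names and values are invented.
from sklearn.feature_extraction import DictVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# One record per facial expression: its shape, the co-occurring head
# movement, the annotated emotion, and the gesturer's and the
# interlocutor's co-occurring speech tokens.
samples = [
    {"face": "Smile", "head": "Nod", "emotion": "friendly",
     "speech_self": "yes", "speech_other": "none"},
    {"face": "Scowl", "head": "Shake", "emotion": "disappointed",
     "speech_self": "none", "speech_other": "no"},
    {"face": "Smile", "head": "Nod", "emotion": "amused",
     "speech_self": "mm", "speech_other": "none"},
    {"face": "BrowRaise", "head": "None", "emotion": "surprised",
     "speech_self": "none", "speech_other": "really"},
] * 25  # replicated only so that cross-validation has enough instances

labels = ["feedback", "none", "feedback", "feedback"] * 25

# One-hot encode the categorical annotations and evaluate with ten-fold
# cross-validation, mirroring the experimental design at a high level.
clf = make_pipeline(DictVectorizer(), MultinomialNB())
scores = cross_val_score(clf, samples, labels, cv=10, scoring="f1_macro")
print(f"mean F1 over 10 folds: {scores.mean():.3f}")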


Notes
Significance is calculated with a corrected paired t-test; the significance threshold is 0.05.
These speech tokens are sometimes called non-lexicalised since they do not usually occur as lemmas in traditional dictionaries.
Significance is calculated with a corrected two-tailed paired t-test: \(t(99) = 2.362\) and \(p = 0.02\).
\(t(99) = 1.9842\) and \(p < 0.05\).
\(t(99) = 1.661\) and \(0.05 < p < 0.1\).
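The "corrected" test referred to in these notes is presumably the corrected resampled t-test (as implemented, for instance, in the Weka toolkit of Witten and Frank 2005), which inflates the variance of the per-run differences to compensate for the overlap between training sets across cross-validation runs; the degrees of freedom \(t(99)\) indicate \(k = 100\) paired results, consistent with ten runs of ten-fold cross-validation. Under that assumption the statistic is

\[ t = \frac{\bar{d}}{\sqrt{\left(\frac{1}{k} + \frac{n_2}{n_1}\right)\hat{\sigma}^2_d}}, \qquad \bar{d} = \frac{1}{k}\sum_{i=1}^{k} d_i, \]

where \(d_i\) is the difference in performance (here, F-measure) between the two classifiers on run \(i\), \(\hat{\sigma}^2_d\) is the sample variance of the \(d_i\), and \(n_2/n_1\) is the ratio of test-set to training-set size (\(0.1/0.9\) for ten-fold cross-validation).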
References
Allwood J, Cerrato L, Jokinen K, Navarretta C, Paggio P (2007) The MUMIN coding scheme for the annotation of feedback, turn management and sequencing. Multimodal corpora for modelling human multimodal behaviour. Lang Resour Eval 41(3–4):273–287
Allwood J, Nivre J, Ahlsén E (1992) On the semantics and pragmatics of linguistic feedback. J Semant 9:1–26
Baranyi P, Csapó A (2012) Definition and synergies of cognitive infocommunications. Acta Polytechnica Hungarica 9(1):67–83
Black M, Yacoob Y (1997) Recognizing facial expressions in image sequences using local parameterized models of image motion. Int J Comput Vis 25(1):23–48
Boersma P, Weenink D (2013) Praat: doing phonetics by computer. http://www.praat.org/. Retrieved 2013.
Bourbakis N, Esposito A, Kavraki D (2011) Extracting and associating meta-features for understanding people’s emotional behaviour: face and speech. Cogn Comput 3:436–448
Busso C, Deng Z, Yildirim S, Bulut M, Lee C, Kazemzadeh A, Lee S, Neumann U, Narayanan S (2004) Analysis of emotion recognition using facial expressions, speech and multimodal information. In: Proceedings of ACM 6th international conference on multimodal interfaces—ICMI 2004, State College, pp 205–211
Camurri A, Lagerlöf I, Volpe G (2003) Recognizing emotion from dance movement: comparison of spectator recognition and automated techniques. Int J Hum-Comput Stud 59(1–2):213–225
Castellano G, Villalba SD, Camurri A (2007) Recognising human emotions from body movement and gesture dynamics. In: Paiva A, Prada R, Picard R (eds) ACII 2007, number 4738 in LNCS. Springer, Berlin, pp 71–82
Cerrato L (2007) Investigating communicative feedback phenomena across languages and modalities. PhD thesis, KTH, Speech and Music Communication, Stockholm
Cohen J (1960) A coefficient of agreement for nominal scales. Educ Psychol Meas 20(1):37–46
Cowie R, Douglas-Cowie E (1996) Automatic statistical analysis of the signal and prosodic signs of emotion in speech. In: Proceedings of international conference on spoken language processing, pp 1989–1992
Eisenberg N, Fabes RA (1992) Emotion, regulation, and the development of social competence. In: Clark MS (ed) Emotion and social behavior. Review of personality and social psychology, vol 14. Sage, Newbury Park, pp 119–150
Ekman P, Friesen W (1975) Unmasking the face. A guide to recognizing emotions from facial clues. Prentice-Hall, Englewood Cliffs
Esposito A, Riviello MT (2011) The cross-modal and cross-cultural processing of affective information. In: Proceedings of the 2011 conference on Neural Nets WIRN10: proceedings of the 20th Italian workshop on neural nets. IOS Press, Amsterdam, pp 301–310
Fujie S, Ejiri Y, Nakajima K, Matsusaka Y, Kobayashi T (2004) A conversation robot using head gesture recognition as para-linguistic information. In: Proceedings of the 13th IEEE international workshop on robot and human interactive communication, pp 159–164
Ioannou S, Raouzaiou A, Tzouvaras V, Mailis T, Karpouzis K, Kollias S (2005) Emotion recognition through facial expression analysis based on a neurofuzzy network. Neural Netw 18(4):423–435
Kipp M (2004) Gesture generation by imitation—from human behavior to computer character animation. PhD thesis, Saarland University, Saarbrücken, Germany. Dissertation.com, Boca Raton, Florida
Kipp M, Martin J-C (2009) Gesture and emotion: can basic gestural form features discriminate emotions? In: Proceedings of the international conference on affective computing and intelligent interaction (ACII-09). IEEE Press, New York, pp 1–8
Lazarus R (1991) Progress on a cognitive-motivational-relational theory of emotion. Am Psychol 46(8):819–834
Levine LJ, Burgess SL (1997) Beyond general arousal: effects of specific emotions on memory. Soc Cogn 15(3):157–181
Lu J, Allwood J, Ahlsén E (2011) A study on cultural variations of smile based on empirical recordings of Chinese and Swedish first encounters. In: Heylen D, Kipp M, Paggio P (eds) Proceedings of the workshop on multimodal corpora at ICMI-MLMI 2011, Alicante, Spain
Martin JC, Caridakis G, Devillers L, Karpouzis K, Abrilian S (2006) Manual annotation and automatic image processing of multimodal emotional behaviours: validating the annotation of TV interviews. In: Proceedings of the language resources and evaluation conference (LREC 2006), Genoa, Italy, pp 24–27
McClave E (2000) Linguistic functions of head movements in the context of speech. J Pragmat 32:855–878
Morency L-P, de Kok I, Gratch J (2009) A probabilistic multimodal approach for predicting listener backchannels. Autonomous agents and multi-agent systems, vol 20. Springer, Berlin, pp 70–84
Morency L-P, Sidner C, Lee C, Darrell T (2005) Contextual recognition of head gestures. In: Proceedings of the international conference on multimodal interfaces, pp 18–24
Morency L-P, Sidner C, Lee C, Darrell T (2007) Head gestures for perceptual interfaces: the role of context in improving recognition. Artif Intell 171(8–9):568–585
Navarretta C (2012) Annotating and analyzing emotions in a corpus of first encounters. In: IEEE (ed) Proceedings of the 3rd IEEE international conference on cognitive infocommunications, Kosice, Slovakia, pp 433–438
Navarretta C, Ahlsén E, Allwood J, Jokinen K, Paggio P (2011) Creating comparable multimodal corpora for nordic languages. In: Proceedings of the 18th Nordic conference of computational linguistics (NODALIDA), Riga, Latvia, pp 153–160
Navarretta C, Ahlsén E, Allwood J, Jokinen K, Paggio P (2012) Feedback in nordic first-encounters: a comparative study. In: Proceedings of LREC 2012, Istanbul, Turkey, pp 2494–2499
Navarretta C, Paggio P (2010) Classification of feedback expressions in multimodal data. In: Proceedings of the 48th annual meeting of the association for computational linguistics (ACL 2010), Uppsala, Sweden, pp 318–324
Navarretta C, Paggio P (2012) Verbal and non-verbal feedback in different types of interactions. In: Proceedings of LREC 2012, Istanbul, Turkey, pp 2338–2342
Niewiadomski R, Hyniewska S, Pelachaud C (2011) Constraint-based model for synthesis of multimodal sequential expressions of emotions. IEEE Trans Affect Comput 2(3):134–146
Ochs M, Sadek D, Pelachaud C (2012) A formal model of emotions for an empathic rational dialog agent. Int J Auton Agents Multi Agent Syst 24(3):410–440
Ortony A, Clore GL, Collins A (1988) The cognitive structure of emotions. Cambridge University Press, Cambridge
Paggio P, Ahlsén E, Allwood J, Jokinen K, Navarretta C (2010) The NOMCO multimodal nordic resource—goals and characteristics. In: Proceedings of LREC 2010, Malta, pp 2968–2973
Paggio P, Navarretta C (2011) Head movements, facial expressions and feedback in Danish first encounters interactions: a culture-specific analysis. In: Stephanidis C (ed) Universal access in human-computer interaction. Users diversity. 6th international conference, UAHCI 2011, held as part of HCI International 2011, number 6766 in LNCS, Orlando, Florida. Springer, Berlin, pp 583–690
Paggio P, Navarretta C (2012) Classifying the feedback function of head movements and face expressions. In: Proceedings of the LREC 2012 workshop Multimodal corpora—how should multimodal corpora deal with the situation? Istanbul, Turkey, pp 2494–2499
Paggio P, Navarretta C (2013) Head movements, facial expressions and feedback in conversations—empirical evidence from Danish multimodal data. J Multimodal User Interfaces 7(1–2):29–37
Pantic M, Rothkrantz L (2000) Automatic analysis of facial expressions: the state of the art. IEEE Trans Pattern Anal Mach Intell 22(12):1424–1445
Peirce CS (1931–1958) Collected papers of Charles Sanders Peirce, 8 vols. Harvard University Press, Cambridge
Plutchik R, Conte R (eds) (1997) Circumplex models of personality and emotions, 2nd edn. American Psychological Association, Washington, DC
Russell J, Mehrabian A (1977) Evidence for a three-factor theory of emotions. J Res Pers 11:273–294
Sallai G (2012) The cradle of the cognitive infocommunications. Acta Polytechnica Hungarica 9(1):171–181
Scherer K (1996) Adding the affective dimension: a new look in speech analysis and synthesis. In: Proceedings of international conference on spoken language processing, pp 1808–1811
Scherer KR (2005) What are emotions? And how can they be measured? Soc Sci Inf 44(4):695–729
Studsgård AL, Navarretta C (2013) Annotating attitudes in the Danish NOMCO corpus of first encounters. In: Allwood J et al (eds) Proceedings of the 4th Nordic symposium on multimodal communication, Gothenburg, Sweden, November 2012. NEALT, Linköping electronic conference proceedings, pp 85–90
Uttl B, Ohta N, Siegenthaler AL (eds) (2008) Memory and emotion: interdisciplinary perspectives. Wiley, New York
Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco
Wundt W (1905) Grundzüge der physiologischen Psychologie. Engelmann, Leipzig
Yngve V (1970) On getting a word in edgewise. In: Papers from the sixth regional meeting of the Chicago linguistic society, pp 567–578
Acknowledgments
The collection and annotation of the NOMCO corpus were funded by the NORDCORP program under the Nordic Research Councils for the Humanities and the Social Sciences (NOS-HS) and by the Danish Research Council for the Humanities (FKK). I would like to thank the coders Anette Luft Studsgård, Sara Andersen, Bjørn Wessel-Tolvig and Magdalena Lis, as well as the NOMCO project's partners Elisabeth Ahlsén, Jens Allwood, Kristiina Jokinen and, especially, Patrizia Paggio.
Cite this article
Navarretta, C. Feedback facial expressions and emotions. J Multimodal User Interfaces 8, 135–141 (2014). https://doi.org/10.1007/s12193-013-0145-9