Abstract
The following is a conversation between an interviewer and a subject during an Adult Attachment Interview (Roisman, Tsai, & Chiang, 2004). AUs are facial action units as defined in Ekman, Friesen, and Hager (2002).
The interviewer asked: “Now, let you choose five adjective words to describe your childhood relationship with your mother when you were about five years old, or as far back as you remember.”
References
Adams, R. B., & Kleck, R. E. (2003). Perceived gaze direction and the processing of facial displays of emotion. Psychological Science, 14, 644–647.
Ambady, N., & Rosenthal, R. (1992). Thin slices of expressive behavior as predictors of interpersonal consequences: A meta-analysis. Psychological Bulletin, 111(2), 256–274.
Balomenos, T., Raouzaiou, A., Ioannou, S., Drosopoulos, A., Karpouzis, K., & Kollias, S. (2005). Emotion analysis in man-machine interaction systems (LNCS 3361; pp. 318–328). New York: Springer.
Bartlett, M. S., Littlewort, G., Frank, M., Lainscsek, C., Fasel, I., & Movellan, J. (2005). Recognizing facial expression: Machine learning and application to spontaneous behavior. In IEEE International Conference on Computer Vision and Pattern Recognition (pp. 568–573).
Batliner, A., Fischer, K., Huber, R., Spilker, J., & Nöth, E. (2003). How to find trouble in communication. Speech Communication, 40, 117–143.
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., et al. (2004). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the International Conference on Multimodal Interfaces (pp. 205–211).
Caridakis, G., Malatesta, L., Kessous, L., Amir, N., Raouzaiou, A., & Karpouzis, K. (2006). Modeling naturalistic affective states via facial and vocal expression recognition. In Proceedings of the International Conference on Multimodal Interfaces (pp. 146–154).
Chen, L., Huang, T. S., Miyasato, T., & Nakatsu, R. (1998). Multimodal human emotion/expression recognition. In Proceedings of the International Conference on Automatic Face and Gesture Recognition (pp. 396–401).
Chen, L. S. (2000). Joint processing of audio-visual information for the recognition of emotional expressions in human-computer interaction. PhD thesis, University of Illinois at Urbana-Champaign, USA.
Cohn, J. F. (2006). Foundations of human computing: Facial expression and emotion. In Proceedings of the International Conference on Multimodal Interfaces (pp. 233–238).
Cohn, J. F., Reed, L. I., Ambadar, Z., Xiao, J., & Moriyama, T. (2004). Automatic analysis and recognition of brow actions and head motion in spontaneous facial behavior. In Proceedings of the International Conference on Systems, Man & Cybernetics, 1 (pp. 610–616).
Cohn, J. F., & Schmidt, K. L. (2004). The timing of facial motion in posed and spontaneous smiles. International Journal of Wavelets, Multiresolution and Information Processing, 2, 1–12.
Cowie, R., Douglas-Cowie, E., & Cox, C. (2005). Beyond emotion archetypes: Databases for emotion modeling using neural networks. Neural Networks, 18, 371–388.
Cowie, R., Douglas-Cowie, E., Savvidou, S., McMahon, E., Sawey, M., & Schröder, M. (2000). 'Feeltrace': An instrument for recording perceived emotion in real time. In Proceedings of the ISCA Workshop on Speech and Emotion (pp. 19–24).
Cowie, R., Douglas-Cowie, E., Tsapatsoulis, N., Votsis, G., Kollias, S., Fellenz, W., & Taylor, J. G. (2001). Emotion recognition in human-computer interaction. IEEE Signal Processing Magazine, 18(1), 32–80.
Douglas-Cowie, E., Campbell, N., Cowie, R., & Roach, P. (2003). Emotional speech: Towards a new generation of databases. Speech Communication, 40(1–2), 33–60.
Duric, Z., Gray, W. D., Heishman, R., Li, F., Rosenfeld, A., Schoelles, M. J., Schunn, C., & Wechsler, H. (2002). Integrating perceptual and cognitive modeling for adaptive and intelligent human-computer interaction. Proceedings of the IEEE, 90(7), 1272–1289.
Ekman, P. (Ed.). (1982). Emotion in the human face (2nd ed.). New York: Cambridge University Press.
Ekman, P., & Friesen, W. V. (1975). Unmasking the face. Englewood Cliffs, NJ: Prentice-Hall.
Ekman, P., Friesen, W. V., & Hager, J. C. (2002). Facial Action Coding System. Salt Lake City, UT: A Human Face.
Ekman, P., & Rosenberg, E. L. (2005). What the face reveals: Basic and applied studies of spontaneous expression using the Facial Action Coding System (2nd ed.). New York: Oxford University Press.
Fragopanagos, N., & Taylor, J. G. (2005). Emotion recognition in human-computer interaction. Neural Networks, 18, 389–405.
Go, H. J., Kwak, K. C., Lee, D. J., & Chun, M. G. (2003). Emotion recognition from facial image and speech signal. In Proceedings of the International Conference of the Society of Instrument and Control Engineers (pp. 2890–2895).
Graciarena, M., Shriberg, E., Stolcke, A., Enos, F., Hirschberg, J., & Kajarekar, S. (2006). Combining prosodic, lexical and cepstral systems for deceptive speech detection. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing, I, 1033–1036.
Gunes, H., & Piccardi, M. (2005). Affect recognition from face and body: Early fusion vs. late fusion. In Proceedings of the International Conference on Systems, Man and Cybernetics (pp. 3437–3443).
Gunes, H., & Piccardi, M. (2006). A bimodal face and body gesture database for automatic analysis of human nonverbal affective behavior. International Conference on Pattern Recognition, 1, 1148–1153.
Harrigan, J. A., Rosenthal, R., & Scherer, K. R. (Eds.). (2005). The new handbook of methods in nonverbal behavior research. New York: Oxford University Press.
Hoch, S., Althoff, F., McGlaun, G., & Rigoll, G. (2005). Bimodal fusion of emotional data in an automotive environment. In ICASSP, II (pp. 1085–1088).
Ji, Q., Lan, P., & Looney, C. (2006). A probabilistic framework for modeling and real-time monitoring human fatigue. IEEE Transactions on Systems, Man, and Cybernetics, Part A, 36(5), 862–875.
Kapoor, A., Burleson, W., & Picard, R. W. (2007). Automatic prediction of frustration. International Journal of Human-Computer Studies, 65(8), 724–736.
Kapoor, A., & Picard, R. W. (2005). Multimodal affect recognition in learning environments. In ACM International Conference on Multimedia (pp. 677–682).
Karpouzis, K., Caridakis, G., Kessous, L., Amir, N., Raouzaiou, A., Malatesta, L., & Kollias, S. (2007). Modeling naturalistic affective states via facial, vocal, and bodily expression recognition (LNAI 4451; pp. 91–112). New York: Springer.
Kuncheva, L. I. (2004). Combining pattern classifiers: Methods and algorithms. Hoboken, NJ: John Wiley and Sons.
Lee, C. M., & Narayanan, S. S. (2005). Toward detecting emotions in spoken dialogs. IEEE Transactions on Speech and Audio Processing, 13(2), 293–303.
Liao, W., Zhang, W., Zhu, Z., Ji, Q., & Gray, W. (2006). Toward a decision-theoretic framework for affect recognition and user assistance. International Journal of Human-Computer Studies, 64(9), 847–873.
Lisetti, C. L., & Nasoz, F. (2002). MAUI: A multimodal affective user interface. In Proceedings of the International Conference on Multimedia (pp. 161–170).
Lisetti, C. L., & Nasoz, F. (2004). Using noninvasive wearable computers to recognize human emotions from physiological signals. EURASIP Journal on Applied Signal Processing, 11, 1672–1687.
Litman, D. J., & Forbes-Riley, K. (2004). Predicting student emotions in computer-human tutoring dialogues. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL) (pp. 352–359).
Littlewort, G. C., Bartlett, M. S.,&Lee, K. (2007). Faces of pain: Automated measurement of spontaneous facial expressions of genuine and posed pain. In Proceedings of the ACM International Conference on Multimodal Interfaces (pp. 15–21).
Maat, L., & Pantic, M. (2006). Gaze-X: Adaptive affective multimodal interface for single-user office scenarios. In Proceedings of the ACM International Conference on Multimodal Interfaces (pp. 171–178).
Pal, P., Iyer, A. N., & Yantorno, R. E. (2006). Emotion detection from infant facial expressions and cries. In Proceedings of the International Conference on Acoustics, Speech & Signal Processing, 2 (pp. 721–724).
Pantic, M., & Bartlett, M. S. (2007). Machine analysis of facial expressions. In K. Delac & M. Grgic (Eds.), Face recognition (pp. 377–416). Vienna, Austria: I-Tech Education.
Pantic, M., Pentland, A., Nijholt, A., & Huang, T. S. (2006). Human computing and machine understanding of human behavior: A survey. In International Conference on Multimodal Interfaces (pp. 239–248).
Pantic, M., & Rothkrantz, L. J. M. (2003). Toward an affect-sensitive multimodal human-computer interaction. Proceedings of the IEEE, 91(9), 1370–1390.
Pantic, M., & Rothkrantz, L. J. M. (2004). Case-based reasoning for user-profiled recognition of emotions from face images. In International Conference on Multimedia and Expo (pp. 391–394).
Pantic, M., Valstar, M. F., Rademaker, R., & Maat, L. (2005). Web-based database for facial expression analysis. In International Conference on Multimedia and Expo (pp. 317–321).
Patras, I., & Pantic, M. (2004). Particle filtering with factorized likelihoods for tracking facial features. In Proceedings of the IEEE International Conference on Face and Gesture Recognition (pp. 97–102).
Pentland, A. (2005). Socially aware computation and communication. IEEE Computer, 38, 33–40.
Petridis, S., & Pantic, M. (2008). Audiovisual discrimination between laughter and speech. In IEEE International Conference on Acoustics, Speech, and Signal Processing (pp. 5117–5120).
Picard, R. W. (1997). Affective computing. Cambridge, MA: MIT Press.
Picard, R. W., Vyzas, E., & Healey, J. (2001). Toward machine emotional intelligence: Analysis of affective physiological state. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(10), 1175–1191.
Pitt, M. K., & Shephard, N. (1999). Filtering via simulation: Auxiliary particle filters. Journal of the American Statistical Association, 94, 590–599.
Roisman, G. I., Tsai, J. L., & Chiang, K. S. (2004). The emotional integration of childhood experience: Physiological, facial expressive, and self-reported emotional response during the adult attachment interview. Developmental Psychology, 40(5), 776–789.
Russell, J. A., Bachorowski, J., & Fernandez-Dols, J. (2003). Facial and vocal expressions of emotion. Annual Review of Psychology, 54, 329–349.
Scherer, K. R. (1999). Appraisal theory. In T. Dalgleish & M. J. Power (Eds.), Handbook of cognition and emotion (pp. 637–663). New York: Wiley.
Schuller, B., Villar, R. J., Rigoll, G., & Lang, M. (2005). Meta-classifiers in acoustic and linguistic feature fusion-based affect recognition. In International Conference on Acoustics, Speech, and Signal Processing (pp. 325–328).
Sebe, N., Cohen, I., Gevers, T., & Huang, T. S. (2006). Emotion recognition based on joint visual and audio cues. In International Conference on Pattern Recognition (pp. 1136–1139).
Sebe, N., Cohen, I., & Huang, T. S. (2005). Multimodal emotion recognition. In Handbook of pattern recognition and computer vision. Singapore: World Scientific.
Song, M., Bu, J., Chen, C., & Li, N. (2004). Audio-visual based emotion recognition: A new approach. In International Conference on Computer Vision and Pattern Recognition (pp. 1020–1025).
Stein, B., & Meredith, M. A. (1993). The merging of the senses. Cambridge, MA: MIT Press.
Stemmler, G. (2003). Methodological considerations in the psychophysiological study of emotion. In R. J. Davidson, K. R. Scherer, & H. H. Goldsmith (Eds.), Handbook of affective sciences (pp. 225–255). New York: Oxford University Press.
Tao, H., & Huang, T. S. (1999). Explanation-based facial motion tracking using a piecewise Bezier volume deformation model. In CVPR'99, 1 (pp. 611–617).
Truong, K. P., & van Leeuwen, D. A. (2007). Automatic discrimination between laughter and speech. Speech Communication, 49, 144–158.
Valstar, M. F., Gunes, H., & Pantic, M. (2007). How to distinguish posed from spontaneous smiles using geometric features. In ACM International Conference on Multimodal Interfaces (pp. 38–45).
Valstar, M., Pantic, M., Ambadar, Z., & Cohn, J. F. (2006). Spontaneous vs. posed facial behavior: Automatic analysis of brow actions. In International Conference on Multimodal Interfaces (pp. 162–170).
Valstar, M. F., & Pantic, M. (2006). Fully automatic facial action unit detection and temporal analysis. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, 3, 149.
Wang, Y., & Guan, L. (2005). Recognizing human emotion from audiovisual information. In ICASSP, II (pp. 1125–1128).
Whissell, C. M. (1989). The dictionary of affect in language. In R. Plutchik & H. Kellerman (Eds.), Emotion: Theory, research and experience: The measurement of emotions (Vol. 4; pp. 113–131). New York: Academic Press.
Xiao, J., Moriyama, T., Kanade, T., & Cohn, J. F. (2003). Robust full-motion recovery of head by dynamic templates and re-registration techniques. International Journal of Imaging Systems and Technology, 13(1), 85–94.
Yoshimoto, D., Shapiro, A., O'Brian, K., & Gottman, J. M. (2005). Nonverbal communication coding systems of committed couples. In J. A. Harrigan, R. Rosenthal, & K. R. Scherer (Eds.), The new handbook of methods in nonverbal behavior research (pp. 369–397). New York: Oxford University Press.
Zeng, Z., Hu, Y., Liu, M., Fu, Y., & Huang, T. S. (2006). Training combination strategy of multi-stream fused hidden Markov model for audio-visual affect recognition. In Proceedings of the ACM International Conference on Multimedia (pp. 65–68).
Zeng, Z., Hu, Y., Roisman, G. I., Wen, Z., Fu, Y., & Huang, T. S. (2007a). Audio-visual spontaneous emotion recognition. In T. S. Huang, A. Nijholt, M. Pantic, & A. Pentland (Eds.), Artificial intelligence for human computing (LNAI 4451; pp. 72–90). New York: Springer.
Zeng, Z., Pantic, M., Roisman, G. I., & Huang, T. S. (2008a). A survey of affect recognition methods: Audio, visual and spontaneous expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence (in press).
Zeng, Z., Tu, J., Liu, M., Zhang, T., Rizzolo, N., Zhang, Z., Huang, T. S., Roth, D., & Levinson, S. (2004). Bimodal HCI-related emotion recognition. In International Conference on Multimodal Interfaces (pp. 137–143).
Zeng, Z., Tu, J., Pianfetti, B., & Huang, T. S. (2008b). Audio-visual affective expression recognition through multi-stream fused HMM. IEEE Transactions on Multimedia, 10(4), 570–577.
Zeng, Z., Tu, J., Liu, M., Huang, T. S., Pianfetti, B., Roth, D.,&Levinson, S. (2007b). Audio-visual affect recognition. IEEE Transactions on Multimedia, 9(2), 424–428.
Zhang, Y.,&Ji, Q. (2005). Active and dynamic information fusion for facial expression understanding from image sequences. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(5), 699–714.
Copyright information
© 2009 Springer-Verlag London Limited
Cite this chapter
Zeng, Z., Pantic, M., & Huang, T. S. (2009). Emotion recognition based on multimodal information. In J. Tao & T. Tan (Eds.), Affective information processing. London: Springer. https://doi.org/10.1007/978-1-84800-306-4_14
Print ISBN: 978-1-84800-305-7
Online ISBN: 978-1-84800-306-4