Abstract
In face-to-face communication, the emotional state of the speaker is transmitted to the listener through a process that combines the verbal and the nonverbal modalities of communication. From this point of view, the transmission of emotional information is redundant, since the same information is conveyed through several channels at once. How much information about the speaker's emotional state is transmitted by each channel, and which channel plays the major role in conveying it? The present study addresses these questions through a perceptual experiment that evaluates the subjective perception of emotional states through single channels (either visual or auditory) and combined channels (visual and auditory). The results suggest that, taken separately, the semantic content of the message and its visual content each carry about as much information as the combined channels, indicating that each channel encodes the emotional features robustly; this redundancy is very helpful in recovering the perception of the emotional state when one of the channels is degraded by noise.
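The question of how much emotional information each channel transmits can be read information-theoretically: in a perceptual experiment, it corresponds to the mutual information between the presented emotion and the listener's judgment. The following is a minimal sketch of that computation; the function name and the confusion counts are purely illustrative assumptions, not data from the study.

```python
import numpy as np

def mutual_information_bits(confusion):
    """Mutual information I(S; R) in bits between the presented emotion S
    (rows) and the perceived emotion R (columns), estimated from a
    confusion matrix of response counts."""
    joint = confusion / confusion.sum()        # joint distribution p(s, r)
    ps = joint.sum(axis=1, keepdims=True)      # marginal p(s), column vector
    pr = joint.sum(axis=0, keepdims=True)      # marginal p(r), row vector
    nz = joint > 0                             # skip zero cells (0 * log 0 = 0)
    return float((joint[nz] * np.log2(joint[nz] / (ps @ pr)[nz])).sum())

# Hypothetical confusion counts for a single channel (e.g., audio only):
# rows = presented emotion, columns = listeners' judgments.
audio_only = np.array([
    [18,  1,  1],
    [ 2, 16,  2],
    [ 1,  3, 16],
])
print(mutual_information_bits(audio_only))
```

Comparing this quantity across the audio-only, video-only, and combined conditions gives one concrete way to ask which channel carries the larger share of the emotional information.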
© 2009 Springer-Verlag London Limited
Cite this chapter
Esposito, A. (2009). Affect in Multimodal Information. In: Tao, J., Tan, T. (eds) Affective Information Processing. Springer, London. https://doi.org/10.1007/978-1-84800-306-4_12
Publisher Name: Springer, London
Print ISBN: 978-1-84800-305-7
Online ISBN: 978-1-84800-306-4