
Virtual agent multimodal mimicry of humans


Abstract

This work concerns multimodal, expressive synthesis on virtual agents, based on the analysis of actions performed by human users. As input we consider the image sequence of the recorded human behavior. Computer vision and image processing techniques are employed to detect the cues needed for expressivity feature extraction. The approach is multimodal in that both facial and gestural aspects of the user's behavior are analyzed and processed. The mimicry consists of perception, interpretation, planning and animation of the expressions shown by the human, resulting not in an exact duplicate but rather in an expressive model of the user's original behavior.
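The loop the abstract describes (perception, interpretation, planning, animation) can be sketched in a few lines of Python. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the ExpressivityFeatures class, the specific parameters (spatial extent, temporal extent, fluidity, power), and the toy formulas are hypothetical stand-ins for the feature extraction and synthesis machinery the abstract refers to.

```python
"""Minimal sketch of the perception -> interpretation -> planning ->
animation loop. All names and feature formulas are illustrative
assumptions, not the paper's actual method."""

import math
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # a tracked hand position in image coordinates


@dataclass
class ExpressivityFeatures:
    """Expressive qualities of a movement, abstracted away from its
    exact trajectory (this concrete parameter set is an assumption)."""
    spatial_extent: float   # how much space the gesture occupies
    temporal_extent: float  # duration of the movement in seconds
    fluidity: float         # smoothness of the speed profile
    power: float            # peak speed, a crude proxy for energy


def interpret(trajectory: List[Point], fps: float = 25.0) -> ExpressivityFeatures:
    """Interpretation: reduce a tracked trajectory (the output of the
    vision-based perception step) to expressivity features rather than
    keeping raw coordinates, so the agent models the behavior instead
    of duplicating it."""
    xs, ys = zip(*trajectory)
    extent = (max(xs) - min(xs)) + (max(ys) - min(ys))
    duration = len(trajectory) / fps
    speeds = [math.dist(a, b) * fps for a, b in zip(trajectory, trajectory[1:])]
    # Fluidity: smooth movements keep a near-constant speed, so we
    # penalize variation around the mean speed.
    mean_v = sum(speeds) / len(speeds)
    variation = sum(abs(v - mean_v) for v in speeds) / len(speeds)
    fluidity = 1.0 / (1.0 + variation / (mean_v + 1e-9))
    return ExpressivityFeatures(extent, duration, fluidity, max(speeds))


def plan(feats: ExpressivityFeatures) -> dict:
    """Planning: map features to animation parameters. A real system
    would emit parameters for an animation engine; a plain dict stands
    in for that here."""
    return {
        "amplitude": feats.spatial_extent,
        "speed": 1.0 / max(feats.temporal_extent, 1e-3),
        "smoothing": feats.fluidity,
        "energy": feats.power,
    }


if __name__ == "__main__":
    # Synthetic "perceived" hand trajectory: a slow quarter arc.
    arc = [(math.cos(t / 20.0), math.sin(t / 20.0)) for t in range(30)]
    print(plan(interpret(arc)))  # parameters an animation step would render
```

Because only the expressivity features flow from interpretation to planning, the animated result preserves the character of the user's movement (its extent, pace, smoothness, and energy) without reproducing its exact path, which matches the abstract's distinction between an expressive model and an exact duplicate.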



Author information


Correspondence to George Caridakis.


About this article

Cite this article

Caridakis, G., Raouzaiou, A., Bevacqua, E. et al. Virtual agent multimodal mimicry of humans. Lang Resources & Evaluation 41, 367–388 (2007). https://doi.org/10.1007/s10579-007-9057-1
