
An annotation scheme for conversational gestures: how to economically capture timing and form

Language Resources and Evaluation

Abstract

The empirical investigation of human gesture stands at the center of multiple research disciplines, and various gesture annotation schemes exist, with varying degrees of precision and required annotation effort. We present a gesture annotation scheme developed for the specific purpose of automatically generating and animating character-specific hand/arm gestures, but one with potential value beyond that application. We focus on how to capture temporal structure and locational information with relatively little annotation effort. The scheme is evaluated in terms of how accurately it captures the original gestures, by re-creating those gestures on an animated character from the annotated data alone. This paper presents our scheme in detail and compares it to other approaches.
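To make the flavor of such an annotation concrete, the sketch below shows one way the scheme's core information, gesture phases with their timing plus a coarse spatial category for the stroke, could be represented as a data structure. The phase inventory follows the standard preparation/stroke/hold/retraction distinction (cf. Kita et al. 1998); the field names and the Python representation itself are illustrative assumptions, not the paper's actual annotation format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional

class Phase(Enum):
    # Standard gesture-phase inventory (cf. Kita et al. 1998)
    PREPARATION = "preparation"
    STROKE = "stroke"
    HOLD = "hold"
    RETRACTION = "retraction"

@dataclass
class PhaseSpan:
    phase: Phase
    start: float  # onset in seconds from the start of the video
    end: float    # offset in seconds

@dataclass
class GestureAnnotation:
    lexeme: str              # label from a gesture lexicon (hypothetical example: "cup")
    handedness: str          # "LH", "RH", or "2H"
    phases: List[PhaseSpan]  # temporal structure of the gesture phrase
    stroke_position: Optional[str] = None  # coarse location, e.g. "periphery upper right"

    def stroke_onset(self) -> Optional[float]:
        # The stroke carries the gesture's expressive peak, so its onset is the
        # natural anchor when re-creating timing on an animated character.
        for span in self.phases:
            if span.phase is Phase.STROKE:
                return span.start
        return None

# Usage: a single annotated gesture with three phases.
g = GestureAnnotation(
    lexeme="cup",
    handedness="RH",
    phases=[
        PhaseSpan(Phase.PREPARATION, 12.3, 12.7),
        PhaseSpan(Phase.STROKE, 12.7, 13.1),
        PhaseSpan(Phase.RETRACTION, 13.1, 13.6),
    ],
    stroke_position="periphery upper right",
)
print(g.stroke_onset())  # 12.7
```

In the spirit of the abstract, the economy of such a record comes from how little must be marked by hand per gesture: a handful of phase boundaries and one coarse position category, rather than a full kinematic trace.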


Notes

  1. We used a slightly different lexicon in Kipp (2004). However, since our current lexicon is smaller than the one used there, the task behind the reported figure was actually more difficult.

References

  • Buisine, S., Abrilian, S., Niewiadomski, R., Martin, J.-C., Devillers, L., & Pelachaud, C. (2006). Perception of blended emotions: From video corpus to expressive agent. In Proc. of the 6th International Conference on Intelligent Virtual Agents, pp. 93–106.

  • Calbris, G. (1990). The semiotics of French gestures. Indiana University Press.

  • Cassell, J., Vilhjálmsson, H., & Bickmore, T. (2001). BEAT: The behavior expression animation toolkit. In Proceedings of SIGGRAPH, pp. 477–486.

  • Chafai, N., Pelachaud, C., Pelé, D., & Breton, G. (2006). Gesture expressivity modulations in an ECA application. In Proceedings of the 6th International Conference on Intelligent Virtual Agents, Springer.

  • Chi, D. M., Costa, M., Zhao, L., & Badler, N. I. (2000). The EMOTE model for effort and shape. In Proc. of SIGGRAPH, pp. 173–182.

  • Frey, S. (1999). Die Macht des Bildes. Bern: Verlag Hans Huber.

  • Frey, S., Hirsbrunner, H. P., Florin, A., Daw, W., & Crawford, R. (1983). A unified approach to the investigation of nonverbal and verbal behavior in communication research. In W. Doise & S. Moscovici (Eds.), Current issues in European Social Psychology (pp. 143–199). Cambridge University Press.

  • Hartmann, B., Mancini, M., Buisine, S., & Pelachaud, C. (2005). Design and evaluation of expressive gesture synthesis for embodied conversational agents. In Proc. of the 4th International Joint Conference on Autonomous Agents and Multiagent Systems, ACM Press.

  • Kendon, A. (1996). An agenda for gesture studies. The Semiotic Review of Books, 7(3), 8–12.

  • Kendon, A. (2004). Gesture: Visible action as utterance. Cambridge University Press.

  • Kipp, M. (2001). Anvil – A generic annotation tool for multimodal dialogue. In Proc. Eurospeech, pp. 1367–1370.

  • Kipp, M. (2004). Gesture generation by imitation: From human behavior to computer character animation. Dissertation.com, Boca Raton, Florida.

  • Kipp, M., Neff, M., Kipp, K. H., & Albrecht, I. (2007). Towards natural gesture synthesis: Evaluating gesture units in a data-driven approach to gesture synthesis. In Proc. of the 7th International Conference on Intelligent Virtual Agents, Springer.

  • Kita, S., van Gijn, I., & van der Hulst, H. (1998). Movement phases in signs and co-speech gestures, and their transcription by human coders. In I. Wachsmuth & M. Fröhlich (Eds.), Gesture and sign language in human-computer interaction (pp. 23–35). Springer.

  • Kopp, S., Krenn, B., Marsella, S., Marshall, A., Pelachaud, C., Pirker, H., Thórisson, K., & Vilhjálmsson, H. (2006). Towards a common framework for multimodal generation in ECAs: The behavior markup language. In Proc. of IVA-06, Springer, pp. 205–217.

  • Kopp, S., Tepper, P., & Cassell, J. (2004). Towards integrated microplanning of language and iconic gesture for multimodal output. In Proc. of the International Conference on Multimodal Interfaces, pp. 97–104.

  • Krämer, N. C., Tietz, B., & Bente, G. (2003). Effects of embodied interface agents and their gestural activity. In Proc. of the 4th International Conference on Intelligent Virtual Agents, Springer.

  • Loehr, D. (2004). Gesture and intonation. Doctoral Dissertation, Georgetown University.

  • Martell, C. (2002). FORM: An extensible, kinematically-based gesture annotation scheme. In Proc. ICSLP-02, pp. 353–356.

  • Martin, J.-C., Niewiadomski, R., Devillers, L., Buisine, S., & Pelachaud, C. (2006). Multimodal complex emotions: Gesture expressivity and blended facial expressions. International Journal of Humanoid Robotics, World Scientific Publishing Company.

  • McNeill, D. (1992). Hand and mind: What gestures reveal about thought. University of Chicago Press.

  • McNeill, D. (2005). Gesture & thought. University of Chicago Press.

  • Neff, M., Kipp, M., Albrecht, I., & Seidel, H.-P. (2008). Gesture modeling and animation based on a probabilistic recreation of speaker behavior. ACM Transactions on Graphics. ACM Press.

  • Prillwitz, S., Leven, R., Zienert, H., Hanke, T., & Henning, J. (1989). Hamburg notation system for sign languages: An introductory guide. In International Studies on Sign Language and Communication of the Deaf, Signum Press.

  • Rist, T., André, E., Baldes, S., Gebhard, P., Klesen, M., Kipp, M., Rist, P., & Schmitt, M. (2003). A review of the development of embodied presentation agents and their application fields. In H. Prendinger & M. Ishizuka (Eds.), Life-like characters - Tools, affective functions, and applications (pp. 377–404). Springer.

  • Schegloff, E. (1984). On some gestures’ relation to talk. In J. M. Atkinson & J. Heritage (Eds.), Structures of social action (pp. 266–298). Cambridge University Press.

  • Vilhjalmsson, H., Cantelmo, N., Cassell, J., Chafai, N. E., Kipp, M., Kopp, S., Mancini, M., Marsella, S., Marshall, A. N., Pelachaud, C., Ruttkay, Z., Thórisson, K. R., van Welbergen, H., & van der Werf, R. J. (2007). The behavior markup language: Recent developments and challenges. In Proc. of the 7th International Conference on Intelligent Virtual Agents, Springer.

  • Webb, R. (1997). Linguistic properties of metaphoric gestures. PhD thesis, University of Rochester, New York.

  • Wegener Knudsen, M., Martin, J.-C., Dybkjær, L., Machuca Ayuso, M., Bernsen, N.O., Carletta, J., Heid, U., Kita, S., Llisterri, J., Pelachaud, C., Poggi, I., Reithinger, N., van Elswijk, G., & Wittenburg, P. (2002). Survey of multimodal annotation schemes and best practice. ISLE Deliverable D9.1. http://isle.nis.sdu.dk/reports/wp9/.

Download references

Acknowledgements

The authors would like to thank all reviewers for their kind and helpful remarks. This research was partially funded by the German Federal Ministry of Education and Research (BMBF) under grant 01 IMB 01A (VirtualHuman). The responsibility for the contents of this paper lies with the authors.

Author information

Corresponding author

Correspondence to Michael Kipp.

Cite this article

Kipp, M., Neff, M. & Albrecht, I. An annotation scheme for conversational gestures: how to economically capture timing and form. Lang Resources & Evaluation 41, 325–339 (2007). https://doi.org/10.1007/s10579-007-9053-5
