
Part of the book series: Cognitive Technologies (COGTECH)

Summary

Do users show emotions and gestures when they interact with a fairly intelligent multimodal dialogue system? And if they do, what do these “emotions” and gestures look like? Are there features that can be exploited for their automatic detection? And finally, what language do users employ when interacting with a multimodal system — does it differ from the language used with a monomodal dialogue system that understands only speech?

To answer these questions, data had to be collected, labeled, and analyzed. This chapter deals with the second step: the transliteration and the labeling.

The three main labeling steps are covered: orthographic transliteration, labeling of user states, and labeling of gestures. Each step is described with its theoretical and developmental background, an overview of the label categories, and some practical advice for readers who are themselves looking for or assembling a coding system. Readers interested in applying the labeling schemes presented here should consult the cited literature — for reasons of space, not all details necessary for actually using the different systems are given in this chapter. For information on the corpus itself, please refer to Schiel and Türk (2006).
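The three annotation tiers described above lend themselves to a simple layered data model. The following Python sketch illustrates how a per-segment annotation record combining the three tiers might be organized; all class names, field names, and label values are hypothetical and do not reflect the actual SmartKom/BAS coding conventions, which are defined in the cited technical documents.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical annotation record for one dialogue segment, combining the
# three labeling tiers discussed in this chapter: orthographic
# transliteration, user-state labels, and gesture labels.
@dataclass
class Segment:
    start_ms: int                       # segment start time (milliseconds)
    end_ms: int                         # segment end time (milliseconds)
    transliteration: str                # orthographic transliteration of speech
    user_state: Optional[str] = None    # illustrative labels, e.g. "neutral"
    gestures: List[str] = field(default_factory=list)  # e.g. ["pointing"]

@dataclass
class Dialogue:
    session_id: str
    segments: List[Segment] = field(default_factory=list)

    def states(self) -> List[str]:
        """Collect the user-state labels of all labeled segments."""
        return [s.user_state for s in self.segments if s.user_state]

# Build a toy dialogue with two annotated segments.
d = Dialogue("w001")
d.segments.append(Segment(0, 1800, "show me the movie listings",
                          user_state="neutral", gestures=["pointing"]))
d.segments.append(Segment(1800, 3500, "no not that one",
                          user_state="anger"))

print(d.states())               # -> ['neutral', 'anger']
print(d.segments[0].gestures)   # -> ['pointing']
```

Keeping the tiers in one time-aligned record, rather than in separate files, makes cross-tier analyses (e.g. which gestures co-occur with which user states) straightforward; the real corpus formats cited above solve the same problem with their own conventions.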


References

  • A. Batliner, K. Fischer, R. Huber, J. Spilker, and E. Nöth. Desperately Seeking Emotions or: Actors, Wizards, and Human Beings. In: Proc. ISCA Workshop on Speech and Emotion, Belfast, Ireland, 2000.

  • N. Beringer, S. Burger, and D. Oppermann. Lexikon der Transliterationen [Lexicon of Transliterations]. Technical Document SmartKom 2, Bavarian Archive for Speech Signals (BAS), 2000.

  • R.R. Cornelius. Theoretical Approaches to Emotion. In: Proc. ISCA Workshop on Speech and Emotion, Belfast, Ireland, 2000.

  • J. Egan. Signal Detection Theory and ROC Analysis. Academic Press, New York, 1975.

  • P. Ekman. Emotional and Conversational Nonverbal Signals. In: L.S. Messing and R. Campbell (eds.), Gesture, Speech and Sign, pp. 45–55. Oxford University Press, 1999.

  • P. Ekman and W.V. Friesen. Facial Action Coding System (FACS): A Technique for the Measurement of Facial Action. Consulting Psychologists Press, 1978.

  • G. Faßnacht. Systematische Verhaltensbeobachtung [Systematic Behavioral Observation]. Reinhardt, Munich, Germany, 1979.

  • K. Fischer. Annotating Emotional Language Data. Verbmobil Report 236, DFKI, 1999.

  • W.V. Friesen, P. Ekman, and H.G. Wallbott. Measuring Hand Movements. Journal of Nonverbal Behavior, 1:97–112, 1979.

  • I. Jacobi and F. Schiel. Interlabeller Agreement in SmartKom Multi-Modal Annotations. Technical Document SmartKom 26, Bavarian Archive for Speech Signals (BAS), 2003.

  • A. Kendon. Gesticulation and Speech: Two Aspects of the Process. In: M.R. Key (ed.), The Relation Between Verbal and Nonverbal Communication. Mouton, The Hague, The Netherlands, 1980.

  • R. Laban. Principles of Dance and Movement Notation. Macdonald & Evans, London, UK, 1956.

  • D. McNeill. Hand and Mind: What Gestures Reveal About Thought. University of Chicago Press, Chicago, IL, 1992.

  • A. Mulder. Human Movement Tracking Technology. Hand Centered Studies of Human Movement Project 94-1, School of Kinesiology, Simon Fraser University, Burnaby, Canada, 1994.

  • N. Munn. The Effect of Knowledge of the Situation Upon Judgment of Emotion From Facial Expressions. Journal of Abnormal and Social Psychology, 35:324–338, 1940.

  • D. Oppermann, F. Schiel, S. Steininger, and N. Beringer. Off-Talk: A Problem for Human-Machine Interaction. In: Proc. EUROSPEECH-01, Aalborg, Denmark, 2001.

  • R. Plutchik. Emotion: A Psycho-Evolutionary Synthesis. Harper and Row, New York, 1980.

  • C. Ratner. A Social Constructionist Critique of Naturalistic Theories of Emotion. Journal of Mind and Behavior, 10:211–230, 1989.

  • F. Schiel and U. Türk. Wizard-of-Oz Recordings. In this volume, 2006.

  • M. Sherman. The Differentiation of Emotional Responses in Infants. Journal of Comparative Psychology, pp. 265–284, 1927.

  • S. Steininger, O. Dioubina, R. Siepmann, C. Beiras-Cunqueiro, and A. Glesner. Labeling von User-State im Mensch-Maschine Dialog — User-State-Kodierkonventionen SmartKom [Labeling of User States in Human-Machine Dialogue — SmartKom User-State Coding Conventions]. Technical Document SmartKom 17, Bavarian Archive for Speech Signals (BAS), 2000.

  • S. Steininger, B. Lindemann, and T. Paetzold. Labeling of Gestures in SmartKom: The Coding System. In: I. Wachsmuth and T. Sowa (eds.), Gesture and Sign Languages in Human-Computer Interaction, Int. Gesture Workshop 2001, London, pp. 215–227. Springer, Berlin, 2001a.

  • S. Steininger, B. Lindemann, and T. Paetzold. Labeling von Gesten im Mensch-Maschine Dialog — Gesten-Kodierkonventionen SmartKom [Labeling of Gestures in Human-Machine Dialogue — SmartKom Gesture Coding Conventions]. Technical Document SmartKom 14, Bavarian Archive for Speech Signals (BAS), 2001b.

  • S. Steininger, F. Schiel, O. Dioubina, and S. Rabold. Development of User-State Conventions for the Multimodal Corpus in SmartKom. In: Proc. Workshop “Multimodal Resources and Multimodal Systems Evaluation”, pp. 33–37, Las Palmas, Spain, 2002a.

  • S. Steininger, F. Schiel, and A. Glesner. Labeling Procedures for the Multi-Modal Data Collection of SmartKom. In: Proc. 3rd Int. Conf. on Language Resources and Evaluation (LREC 2002), pp. 371–377, Las Palmas, Spain, 2002b.

  • H. Wallbott. Faces in Context: The Relative Importance of Facial Expression and Context Information in Determining Emotion Attributions. In: K. Scherer (ed.), Facets of Emotion. Lawrence Erlbaum, Mahwah, NJ, 1988.


Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Steininger, S., Schiel, F., Rabold, S. (2006). Annotation of Multimodal Data. In: Wahlster, W. (ed.), SmartKom: Foundations of Multimodal Dialogue Systems. Cognitive Technologies. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36678-4_35

Download citation

  • DOI: https://doi.org/10.1007/3-540-36678-4_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23732-7

  • Online ISBN: 978-3-540-36678-2

  • eBook Packages: Computer Science (R0)
