Face-to-Face Interaction and the KTH Cooking Show

  • Chapter

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5967)

Abstract

We share our experiences with integrating motion capture recordings in speech and dialogue research by describing (1) Spontal, a large project collecting 60 hours of spontaneous dialogues recorded with video, audio and motion capture, with special attention to motion capture and its pitfalls; (2) a tutorial in which we use motion capture, speech synthesis and an animated talking head to let students create an active listener; and (3) brief preliminary results in the form of visualizations of motion capture data over time in a Spontal dialogue. Given how little has been written on the use of motion capture for speech research, we hope that these accounts will prove inspirational and informative.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Beskow, J., Edlund, J., Granström, B., Gustafson, J., House, D. (2010). Face-to-Face Interaction and the KTH Cooking Show. In: Esposito, A., Campbell, N., Vogel, C., Hussain, A., Nijholt, A. (eds) Development of Multimodal Interfaces: Active Listening and Synchrony. Lecture Notes in Computer Science, vol 5967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12397-9_13

  • DOI: https://doi.org/10.1007/978-3-642-12397-9_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12396-2

  • Online ISBN: 978-3-642-12397-9

  • eBook Packages: Computer Science, Computer Science (R0)
