Face-to-Face Interaction and the KTH Cooking Show

Beskow, Jonas; Edlund, Jens; Granström, Björn; Gustafson, Joakim; House, David

doi:10.1007/978-3-642-12397-9_13

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5967)

Abstract

We share our experiences with integrating motion capture recordings in speech and dialogue research by describing (1) Spontal, a large project collecting 60 hours of video, audio and motion capture recordings of spontaneous dialogues, with special attention to motion capture and its pitfalls; (2) a tutorial in which we use motion capture, speech synthesis and an animated talking head to allow students to create an active listener; and (3) brief preliminary results in the form of visualizations of motion capture data over time in a Spontal dialogue. Given the lack of writings on the use of motion capture for speech research, we hope that these accounts will prove inspirational and informative.

Author information

Authors and Affiliations

KTH Speech Music and Hearing/Centre for Speech Technology, Lindstedtsvägen 24, SE-100 44, Stockholm, Sweden
Jonas Beskow, Jens Edlund, Björn Granström, Joakim Gustafson & David House

Editor information

Editors and Affiliations

Second University of Naples, and IIASS, Via Pellegrino, 84019, Vietri sul Mare, SA, Italy
Anna Esposito
Centre for Language and Communication Studies, Trinity College, The University of Dublin, Dublin 2, Ireland
Nick Campbell & Carl Vogel
Department of Computing Science & Mathematics, University of Stirling, FK9 4LA, Stirling, Scotland, UK
Amir Hussain
Faculty of Electrical Engineering, Mathematics and Computer Science, University of Twente, P.O. Box 217, 7500 AE, Enschede, The Netherlands
Anton Nijholt

About this chapter

Cite this chapter

Beskow, J., Edlund, J., Granström, B., Gustafson, J., House, D. (2010). Face-to-Face Interaction and the KTH Cooking Show. In: Esposito, A., Campbell, N., Vogel, C., Hussain, A., Nijholt, A. (eds) Development of Multimodal Interfaces: Active Listening and Synchrony. Lecture Notes in Computer Science, vol 5967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12397-9_13

DOI: https://doi.org/10.1007/978-3-642-12397-9_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12396-2
Online ISBN: 978-3-642-12397-9
eBook Packages: Computer Science, Computer Science (R0)
