Abstract
Public speaking performances are characterized not only by the presentation of the content, but also by the presenters’ nonverbal behavior, such as gestures, tone of voice, vocal variety, and facial expressions. In this work, we seek to identify automatic nonverbal behavior descriptors that correlate with expert assessments of behaviors characteristic of good and bad public speaking performances. We present a novel multimodal corpus recorded with a virtual audience public speaking training platform. Lastly, we use the behavior descriptors to automatically approximate the overall assessment of the performance using support vector regression in a speaker-independent experiment, and obtain promising results approaching human performance.
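As a rough illustration of the two evaluation ideas named in the abstract, the sketch below correlates a single behavior descriptor with expert scores and then runs a speaker-independent (leave-one-speaker-out) prediction loop. Everything in it is an assumption made for illustration: the data values are made up, the names `pearson`, `leave_one_speaker_out`, and `fit_line` are hypothetical, and an ordinary least-squares line stands in for support vector regression so that the example needs only the Python standard library. It is not the authors' pipeline.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def leave_one_speaker_out(speaker_ids):
    """Yield (held_out_speaker, train_indices, test_indices), one fold per speaker."""
    for held_out in sorted(set(speaker_ids)):
        train = [i for i, s in enumerate(speaker_ids) if s != held_out]
        test = [i for i, s in enumerate(speaker_ids) if s == held_out]
        yield held_out, train, test


def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept (a stand-in for SVR)."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


# Made-up toy data: two recordings for each of three hypothetical speakers.
speakers = ["s1", "s1", "s2", "s2", "s3", "s3"]
descriptor = [0.2, 0.3, 0.5, 0.6, 0.8, 0.9]      # hypothetical behavior descriptor
expert_score = [2.0, 2.5, 3.0, 3.5, 4.0, 4.5]    # hypothetical expert ratings

# Step 1: how strongly does the descriptor correlate with the expert ratings?
r_descriptor = pearson(descriptor, expert_score)

# Step 2: speaker-independent prediction. The model never sees the held-out speaker.
folds = list(leave_one_speaker_out(speakers))
predictions = [0.0] * len(speakers)
for _, train, test in folds:
    slope, intercept = fit_line(
        [descriptor[i] for i in train], [expert_score[i] for i in train]
    )
    for i in test:
        predictions[i] = slope * descriptor[i] + intercept

# Agreement between held-out predictions and expert ratings.
r_prediction = pearson(predictions, expert_score)
print(f"descriptor vs. expert: r = {r_descriptor:.3f}")
print(f"prediction vs. expert: r = {r_prediction:.3f}")
```

Splitting by speaker rather than by recording is the point of the design: if recordings from the same speaker appeared in both training and test sets, the reported agreement could reflect speaker identity rather than the behavior descriptors themselves.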
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Batrinca, L., Stratou, G., Shapiro, A., Morency, L.-P., Scherer, S. (2013). Cicero - Towards a Multimodal Virtual Audience Platform for Public Speaking Training. In: Aylett, R., Krenn, B., Pelachaud, C., Shimodaira, H. (eds) Intelligent Virtual Agents. IVA 2013. Lecture Notes in Computer Science, vol 8108. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40415-3_10
DOI: https://doi.org/10.1007/978-3-642-40415-3_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40414-6
Online ISBN: 978-3-642-40415-3
eBook Packages: Computer Science (R0)