DOI: 10.1145/2666633.2666640

Using Multimodal Cues to Analyze MLA'14 Oral Presentation Quality Corpus: Presentation Delivery and Slides Quality

Published: 12 November 2014

ABSTRACT

The ability to create presentation slides and deliver them effectively is a task of increasing importance, particularly in the pursuit of academic and professional career success. We envision that multimodal sensing and machine learning techniques can be employed to evaluate, and potentially help improve, the quality of both the content and delivery of public presentations. To this end, we report a study using the Oral Presentation Quality Corpus provided by the 2014 Multimodal Learning Analytics (MLA) Grand Challenge. A set of multimodal features was extracted from slides, speech, posture, hand gestures, and head poses. We also examined the dimensionality of the human scores, which could be concisely represented by two Principal Component (PC) scores: comp1 for delivery skills and comp2 for slides quality. Several machine learning experiments were performed to predict the two PC scores from the multimodal features. Our experiments suggest that multimodal cues can predict human scores on presentation tasks, and that a scoring model combining verbal and visual features can outperform one using a single modality.
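The two-step pipeline the abstract describes, reducing multi-dimensional human rubric scores to two PC scores and then predicting a PC score from multimodal features, can be sketched as follows. This is a minimal illustration only: the rubric dimensions, feature names, sample sizes, and model settings below are made up and do not reproduce the paper's actual data or configuration (the paper's experiments used models such as support vector regression; scikit-learn is assumed here for the sketch).

```python
# Hypothetical sketch of the abstract's pipeline: PCA over rubric scores,
# then regression from multimodal features to a PC score.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# 40 presentations scored on 6 made-up rubric dimensions (1-5 scale).
human_scores = rng.integers(1, 6, size=(40, 6)).astype(float)

# Two principal components summarize the rubric: in the paper's terms,
# comp1 (delivery skills) and comp2 (slides quality).
pca = PCA(n_components=2)
pc_scores = pca.fit_transform(human_scores)  # shape (40, 2)

# 10 made-up multimodal features (e.g., speaking rate, pitch variation,
# gesture counts), standing in for the paper's extracted features.
features = rng.normal(size=(40, 10))

# Predict comp1 from the multimodal features with support vector regression.
model = SVR(kernel="rbf").fit(features, pc_scores[:, 0])
predictions = model.predict(features)
print(pc_scores.shape, predictions.shape)
```

In practice the regression would be evaluated with cross-validation against the held-out human PC scores, and separate models would be fit for comp1 and comp2.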


Published in
MLA '14: Proceedings of the 2014 ACM workshop on Multimodal Learning Analytics Workshop and Grand Challenge
November 2014, 68 pages
ISBN: 9781450304887
DOI: 10.1145/2666633
Copyright © 2014 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Qualifiers
• research-article

Acceptance Rates
MLA '14 paper acceptance rate: 3 of 3 submissions, 100%
Overall acceptance rate: 3 of 3 submissions, 100%
