ABSTRACT
The ability to create presentation slides and deliver them effectively to an audience is increasingly important for success in both academic and professional careers. We envision that multimodal sensing and machine learning techniques can be employed to evaluate, and potentially help improve, the quality of the content and delivery of public presentations. To this end, we report a study using the Oral Presentation Quality Corpus provided by the 2014 Multimodal Learning Analytics (MLA) Grand Challenge. A set of multimodal features was extracted from slides, speech, posture, hand gestures, and head poses. We also examined the dimensionality of the human scores, which could be concisely represented by two Principal Component (PC) scores: comp1 for delivery skills and comp2 for slides quality. Several machine learning experiments were performed to predict the two PC scores from the multimodal features. Our experiments suggest that multimodal cues can predict human scores on presentation tasks, and that a scoring model combining verbal and visual features can outperform one using a single modality.
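The abstract outlines a two-stage analysis: reduce the multi-item human rubric scores to two principal components, then train regressors that predict each component from multimodal features. The sketch below is a minimal illustration of that pipeline, not the authors' implementation; the data, feature dimensions, and choice of support vector regression are assumptions for demonstration only.

```python
# Minimal sketch of the two-stage analysis described in the abstract:
# (1) PCA reduces human rubric scores to two component scores,
# (2) a regressor predicts each component from multimodal features.
# All data below are synthetic placeholders.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_presentations = 400

# Hypothetical human ratings on several rubric items per presentation.
human_scores = rng.normal(size=(n_presentations, 6))

# Step 1: two principal components summarize the rubric scores
# (comp1 ~ delivery skills, comp2 ~ slides quality in the paper's analysis).
pca = PCA(n_components=2)
pc_scores = pca.fit_transform(human_scores)

# Hypothetical multimodal features (speech, head pose, posture/gesture,
# slide-based measures) concatenated into one vector per presentation.
multimodal_features = rng.normal(size=(n_presentations, 40))

# Step 2: regress each PC score on the multimodal features; SVR is one of
# several learners that could be compared in such an experiment.
for name, target in [("comp1 (delivery)", pc_scores[:, 0]),
                     ("comp2 (slides)", pc_scores[:, 1])]:
    scores = cross_val_score(SVR(kernel="rbf"), multimodal_features, target,
                             cv=5, scoring="r2")
    print(f"{name}: mean cross-validated R^2 = {scores.mean():.3f}")
```

In practice, one would compare single-modality feature sets (e.g., speech only) against the combined verbal-plus-visual set to test the abstract's claim that multimodal models outperform unimodal ones.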