Abstract
Classroom teachers use many nonverbal behaviors, such as gesturing and walking, to maintain student attention. Quantifying instructor behavior in a live classroom has traditionally required manual coding, a prohibitively time-consuming process that precludes timely, fine-grained feedback to instructors. Here we propose an automated method for assessing teachers’ nonverbal behaviors using video-based motion estimation tailored for classroom applications. Motion was estimated by subtracting background pixels that varied little from their mean values; noise was then reduced with filters designed specifically for the movements and speeds of teachers. Camera pan and zoom events were also detected, using a method based on tracking the correlations between moving points in the video. Results indicated that the motion estimation method was effective for predicting instructors’ nonverbal behaviors, including gestures (kappa = .298), walking (kappa = .338), and camera pan (an indicator of instructor movement; kappa = .468), all of which are plausibly related to student attention. We also found evidence of predictive validity: the automated predictions of instructor behaviors correlated with students’ mean self-reported level of attention (e.g., r = .346 for walking), indicating that the proposed method captures the association between instructors’ nonverbal behaviors and student attention. We discuss the potential for providing timely, fine-grained, automated feedback to teachers, as well as opportunities for future classroom studies using this method.
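The core idea of the motion estimator (counting pixels that deviate from a slowly updated background mean) can be sketched as follows. This is a minimal illustration of the general technique, not the authors’ released implementation; the function name, threshold, and learning rate are our own assumptions.

```python
import numpy as np

def motion_per_frame(frames, threshold=25.0, alpha=0.05):
    """Return the fraction of 'moving' pixels in each frame.

    frames: iterable of 2-D grayscale arrays (H x W)
    threshold: absolute deviation from the running background mean
               that counts as motion (assumed value)
    alpha: learning rate for the exponential running mean that
           models the static background (assumed value)
    """
    background = None
    motion = []
    for frame in frames:
        f = np.asarray(frame, dtype=np.float64)
        if background is None:
            # First frame initializes the background model.
            background = f.copy()
            motion.append(0.0)
            continue
        # Pixels far from the background mean are treated as motion.
        moving = np.abs(f - background) > threshold
        motion.append(float(moving.mean()))
        # Update the background only at stable pixels, so a slowly
        # walking instructor is not absorbed into the background.
        background = np.where(
            moving, background, (1 - alpha) * background + alpha * f
        )
    return motion
```

In practice this per-frame signal would still need the teacher-specific noise filtering the abstract describes before it is usable for behavior prediction.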
Notes
1. We have made the code for motion, camera pan, and zoom estimation available online at https://github.com/pnb/classroom-motion.
2. We experimented with a range of segment lengths from 10 to 60 s, finding that 30–60 s segments provided equivalent results and were consistently better than 10 or 20 s. We thus segmented at 30 s intervals to provide the finest granularity from the 30–60 s range.
3. Changing the segment length does not have a dramatic effect on results. Longer segment lengths (e.g., 700 s) produce slightly stronger correlations, but we report our original segment length in this paper (500 s) to avoid overfitting the analysis to desirable results.
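The fixed-interval segmenting described in note 2 amounts to averaging per-frame motion estimates over windows of a given duration. A minimal sketch, with a hypothetical function name and the frame rate as an assumed parameter:

```python
def segment_means(per_frame_motion, fps=30.0, segment_seconds=30):
    """Average a per-frame motion signal over fixed-length segments.

    per_frame_motion: list of motion estimates, one per video frame
    fps: video frame rate (assumed)
    segment_seconds: segment length; note 2 settled on 30 s
    """
    n = int(fps * segment_seconds)  # frames per segment
    return [
        sum(per_frame_motion[i:i + n]) / len(per_frame_motion[i:i + n])
        for i in range(0, len(per_frame_motion), n)
    ]
```

The final (possibly shorter) window is averaged over however many frames remain, which matches the usual convention for trailing partial segments.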
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Bosch, N., Mills, C., Wammes, J.D., Smilek, D. (2018). Quantifying Classroom Instructor Dynamics with Computer Vision. In: Penstein Rosé, C., et al. (eds.) Artificial Intelligence in Education. AIED 2018. Lecture Notes in Computer Science, vol. 10947. Springer, Cham. https://doi.org/10.1007/978-3-319-93843-1_3
Print ISBN: 978-3-319-93842-4
Online ISBN: 978-3-319-93843-1