Measuring and Integrating Facial Expressions and Head Pose as Indicators of Engagement and Affect in Tutoring Systems

  • Conference paper
  • In: Adaptive Instructional Systems. Adaptation Strategies and Methods (HCII 2021)

Part of the book series: Lecture Notes in Computer Science (volume 12793)

Abstract

While using online learning software, students display a wide range of reactions, levels of engagement, and emotions (e.g., confusion, boredom, excitement). Making such information automatically accessible to teachers (or digital tutors) can help them understand how students are progressing and suggest who needs further assistance, and when. As part of this work, we conducted two studies that used computer vision techniques to measure students' engagement and affective states from their head pose and facial expressions as they used MathSpring.org, an online tutoring system designed to help students practice mathematics problem solving. We present a Head Pose Tutor, which estimates students' head direction in real time and responds to potential disengagement, and a Facial Expression-Augmented Teacher Dashboard, which identifies students' affective states and reports them to teachers. We collected video data of undergraduate students interacting with MathSpring. Preliminary results on these videos were encouraging, indicating accurate detection of head orientation. A usability study with practicing teachers provided an initial evaluation of the potential impact of the proposed Teacher Dashboard software.
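The abstract does not detail how the Head Pose Tutor detects disengagement. A minimal sketch of the general idea, under the assumption that 2D facial landmarks are available from some detector: when the head turns, the nose tip shifts horizontally relative to the midpoint between the eyes, and normalizing that shift by the inter-eye distance yields a scale-invariant yaw estimate that can be thresholded. The function names, landmark choice, and 30-degree threshold here are illustrative assumptions, not the paper's method.

```python
import math

def estimate_yaw(left_eye, right_eye, nose_tip):
    """Rough yaw estimate (degrees) from 2D landmark asymmetry.

    A frontal face has the nose tip near the midpoint between the
    eyes; turning the head shifts it toward one eye. Dividing the
    shift by the inter-eye distance makes the ratio scale-invariant.
    Landmarks are (x, y) pixel coordinates.
    """
    mid_x = (left_eye[0] + right_eye[0]) / 2.0
    inter_eye = math.dist(left_eye, right_eye)
    asymmetry = (nose_tip[0] - mid_x) / inter_eye  # ~0 when frontal
    # Clamp before asin to stay numerically safe for extreme inputs.
    return math.degrees(math.asin(max(-1.0, min(1.0, asymmetry))))

def is_disengaged(yaw_deg, threshold=30.0):
    """Flag potential disengagement when the head turns far off-screen.

    The 30-degree default is a hypothetical tuning parameter.
    """
    return abs(yaw_deg) > threshold

# Frontal face: nose tip centered between the eye corners.
frontal = estimate_yaw((100, 120), (160, 120), (130, 150))
# Turned face: nose tip shifted well toward the right eye corner.
turned = estimate_yaw((100, 120), (160, 120), (165, 150))
print(round(frontal, 1), is_disengaged(frontal))  # → 0.0 False
print(is_disengaged(turned))                      # → True
```

A deployed system would instead fit a 3D head model to the landmarks (e.g., via perspective-n-point pose recovery) or use a learned regressor, as several of the paper's cited works do; this toy version only conveys the thresholding logic a tutor could use to trigger a re-engagement prompt.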



Author information

Correspondence to Ivon Arroyo.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Yu, H. et al. (2021). Measuring and Integrating Facial Expressions and Head Pose as Indicators of Engagement and Affect in Tutoring Systems. In: Sottilare, R.A., Schwarz, J. (eds.) Adaptive Instructional Systems. Adaptation Strategies and Methods. HCII 2021. Lecture Notes in Computer Science, vol. 12793. Springer, Cham. https://doi.org/10.1007/978-3-030-77873-6_16

  • DOI: https://doi.org/10.1007/978-3-030-77873-6_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-77872-9

  • Online ISBN: 978-3-030-77873-6

  • eBook Packages: Computer Science (R0)
