MuOE: A Multi-task Ordinality Aware Approach Towards Engagement Detection

Gandhi, Saumya; Fadia, Aayush; Agrawal, Ritik; Agrawal, Surbhi; Kumar, Praveen

doi:10.1007/978-3-031-45170-6_8

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14301)

Included in the following conference series:

International Conference on Pattern Recognition and Machine Intelligence

Abstract

With the increasing adoption of online learning, declining student engagement is becoming widespread. Detecting disengagement is the first step in making online education more viable and effective. We present MuOE, a Multi-task Ordinality-aware Engagement detection model that identifies attention levels from students’ webcam videos. MuOE uses a transformer, chosen for its sequence-processing capability, together with a novel selector-based attention mechanism that picks important video frames. Facial cue detection serves as an auxiliary task in our multi-task formulation of the problem, so that the shared model base receives more supervision. We leverage the ordinal nature of engagement levels by introducing a smooth loss function that penalizes predictions according to their distance from the true label. In this paper, we motivate each component of MuOE and demonstrate its utility through a set of quantitative experiments. We achieve a state-of-the-art accuracy of 57.65% (Top-2 accuracy 95.07%) on the DAiSEE dataset.
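The abstract does not give the exact form of the smooth ordinal loss, so the following is only a minimal sketch of one common way to make a classification loss ordinality-aware: cross-entropy against soft targets that decay with distance from the true label. The function names, the exponential decay, the `temperature` parameter, and the choice of four ordered engagement levels are all assumptions made for illustration, not the authors' formulation. A small Top-2 helper is included to show how a Top-2 accuracy figure is typically counted.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_z for z in logits]


def ordinal_soft_targets(true_label, num_classes, temperature=1.0):
    """Hypothetical soft targets: weight decays exponentially with |k - true_label|."""
    weights = [math.exp(-abs(k - true_label) / temperature) for k in range(num_classes)]
    total = sum(weights)
    return [w / total for w in weights]


def ordinal_smooth_loss(logits, true_label, temperature=1.0):
    """Cross-entropy between the soft ordinal targets and the predicted distribution.

    Predictions concentrated near the true level are penalized less than
    predictions concentrated far from it.
    """
    log_probs = log_softmax(logits)
    targets = ordinal_soft_targets(true_label, len(logits), temperature)
    return -sum(t * lp for t, lp in zip(targets, log_probs))


def top_k_correct(logits, true_label, k=2):
    """True if the true label is among the k highest-scoring classes."""
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return true_label in ranked[:k]


# Assumed setup: four ordered engagement levels, true level = 3 (highest).
near_miss = [0.0, 0.0, 5.0, 0.0]  # model favours level 2, one step away
far_miss = [5.0, 0.0, 0.0, 0.0]   # model favours level 0, three steps away
loss_near = ordinal_smooth_loss(near_miss, true_label=3)
loss_far = ordinal_smooth_loss(far_miss, true_label=3)
```

With a plain one-hot cross-entropy, `near_miss` and `far_miss` would incur the same loss, since both assign the same score to the true class; under the distance-weighted targets, `loss_near` is smaller than `loss_far`.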

Notes

1. We refrain from including Facial Action Units to avoid information leaking to the auxiliary task of predicting regressive action units.

Author information

Authors and Affiliations

Visvesvaraya National Institute of Technology, Nagpur, India
Saumya Gandhi, Aayush Fadia, Ritik Agrawal, Surbhi Agrawal & Praveen Kumar

Corresponding author

Correspondence to Saumya Gandhi.

Editor information

Editors and Affiliations

Indian Statistical Institute, Kolkata, India
Pradipta Maji
Texas A&M University at Qatar, Doha, Qatar
Tingwen Huang
Indian Statistical Institute, Kolkata, West Bengal, India
Nikhil R. Pal
Indian Institute of Technology Jodhpur, Jodhpur, India
Santanu Chaudhury
Indian Statistical Institute, Kolkata, West Bengal, India
Rajat K. De

About this paper

Cite this paper

Gandhi, S., Fadia, A., Agrawal, R., Agrawal, S., Kumar, P. (2023). MuOE: A Multi-task Ordinality Aware Approach Towards Engagement Detection. In: Maji, P., Huang, T., Pal, N.R., Chaudhury, S., De, R.K. (eds) Pattern Recognition and Machine Intelligence. PReMI 2023. Lecture Notes in Computer Science, vol 14301. Springer, Cham. https://doi.org/10.1007/978-3-031-45170-6_8

DOI: https://doi.org/10.1007/978-3-031-45170-6_8
Published: 04 December 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-45169-0
Online ISBN: 978-3-031-45170-6
eBook Packages: Computer Science, Computer Science (R0)