Student behavior recognition based on multitask learning

Mo, Jianwen; Zhu, Rui; Yuan, Hua; Shou, Zhaoyu; Chen, Lingping

doi:10.1007/s11042-022-14100-7

Track 5: Multimedia and Education
Published: 25 November 2022

Volume 82, pages 19091–19108 (2023)
Multimedia Tools and Applications

Jianwen Mo¹,
Rui Zhu¹,
Hua Yuan¹,
Zhaoyu Shou¹ &
…
Lingping Chen²

482 Accesses
5 Citations

Abstract

The assessment of students’ classroom behavior is an important part of classroom teaching evaluation. However, teachers cannot evaluate the listening status of every student in the class in a timely, objective, and accurate manner. We propose a multitask classroom behavior recognition method that combines human pose estimation and object detection. First, a target detector extracts each individual’s region from the keyframe as the network’s input. Then, the multitask heatmap network (MTHN) module extracts intermediate heatmaps with multiscale feature association. The pose estimation and object detection tasks are constructed through mapping relations to obtain the keypoints and the object position information. Finally, the keypoint behavior vector and the metric vector are used to model the behavior, and a classroom behavior detection algorithm based on a fully connected network is designed. Additionally, we created a classroom dataset with pose estimation, object, and behavior labels, and we use transfer learning to compensate for the insufficient sample size. Experiments show that the detection accuracy of the proposed multitask learning-based student behavior recognition algorithm exceeds 90%.
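
The last two stages of the pipeline described above (reading keypoints off heatmaps, then classifying a behavior vector with a fully connected network) can be illustrated with a minimal sketch. This is not the authors' MTHN or their feature definition: the peak-picking decoder is the common argmax approach to heatmap decoding, the feature layout (normalized keypoint coordinates plus keypoint-to-object distances) is a hypothetical stand-in for the keypoint behavior vector and the metric vector, and the network weights are random rather than trained. All function names are illustrative.

```python
import math
import random


def decode_keypoints(heatmaps):
    """Return one (x, y, score) peak per heatmap (each heatmap is a 2-D list)."""
    peaks = []
    for hm in heatmaps:
        best = (0, 0, hm[0][0])
        for y, row in enumerate(hm):
            for x, value in enumerate(row):
                if value > best[2]:
                    best = (x, y, value)
        peaks.append(best)
    return peaks


def build_feature(keypoints, object_center, width, height):
    """Hypothetical feature: normalized keypoint coordinates followed by
    normalized distances from each keypoint to a detected object's center."""
    ox, oy = object_center
    coords, dists = [], []
    for x, y, _score in keypoints:
        coords += [x / width, y / height]
        dists.append(math.hypot((x - ox) / width, (y - oy) / height))
    return coords + dists


def dense(x, weights, biases):
    """One fully connected layer: weights is a list of rows, one per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [v / total for v in exps]


def classify(feature, params):
    """Two-layer fully connected network with ReLU, returning class probabilities."""
    (w1, b1), (w2, b2) = params
    hidden = [max(0.0, v) for v in dense(feature, w1, b1)]
    return softmax(dense(hidden, w2, b2))


def random_params(n_in, n_hidden, n_out, seed=0):
    """Untrained placeholder weights; a real model would learn these."""
    rng = random.Random(seed)

    def layer(rows, cols):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                   for _ in range(rows)]
        return weights, [0.0] * rows

    return layer(n_hidden, n_in), layer(n_out, n_hidden)


# Toy input: two 4x4 heatmaps, each with a single peak.
heatmaps = [[[0.0] * 4 for _ in range(4)] for _ in range(2)]
heatmaps[0][1][2] = 0.9  # keypoint 0 peaks at x=2, y=1
heatmaps[1][3][0] = 0.8  # keypoint 1 peaks at x=0, y=3

keypoints = decode_keypoints(heatmaps)
feature = build_feature(keypoints, object_center=(2, 2), width=4, height=4)
probs = classify(feature, random_params(len(feature), n_hidden=8, n_out=3))
```

Here `probs` holds one probability per (hypothetical) behavior class. With untrained weights the values carry no meaning; the sketch only shows how heatmap peaks and object positions could flow into a fully connected classifier.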

Funding

This work was supported by the National Natural Science Foundation of China (Grant Numbers 62177012, 62001133, and 61967005) and the Innovation Project of GUET Graduate Education (Grant Number 2021YCXS027).

Author information

Authors and Affiliations

School of Information and Communication, Guilin University of Electronic Technology, Guilin, 541001, China
Jianwen Mo, Rui Zhu, Hua Yuan & Zhaoyu Shou
Educational Technology Statistics, Guilin Institute of Information Technology, Guilin, 541001, China
Lingping Chen

Corresponding author

Correspondence to Hua Yuan.

Ethics declarations

Conflict of Interests

The authors declare no conflict of interest.

Additional information

Availability of data and materials

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Mo, J., Zhu, R., Yuan, H. et al. Student behavior recognition based on multitask learning. Multimed Tools Appl 82, 19091–19108 (2023). https://doi.org/10.1007/s11042-022-14100-7

Received: 10 March 2022
Revised: 06 September 2022
Published: 25 November 2022
Issue Date: May 2023
DOI: https://doi.org/10.1007/s11042-022-14100-7
