Student behavior recognition based on multitask learning

  • Track 5: Multimedia and Education
Multimedia Tools and Applications

Abstract

The assessment of students’ classroom behavior is an important part of classroom teaching evaluation. However, teachers cannot evaluate the listening status of every student in the class in a timely, objective, and accurate manner. We propose a multitask classroom behavior recognition method that combines human pose estimation and object detection. First, an object detector extracts each individual’s region from the keyframe as the network’s input. Then, the multitask heatmap network (MTHN) module extracts an intermediate heatmap of multiscale feature associations. The pose estimation and object detection tasks are built on mapping relations over this heatmap to obtain the keypoints and the object position information. Finally, the keypoint behavior vector and the metric vector are used to model the behavior, and a classroom behavior detection algorithm based on a fully connected network is designed. Additionally, we created a classroom dataset with pose estimation, object, and behavior labels, and we use transfer learning to address the insufficient sample size. Experiments show that the detection accuracy of the proposed multitask learning-based student behavior recognition algorithm exceeds 90%.
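As a rough illustration of the heatmap-to-keypoint step mentioned in the abstract, the sketch below takes the highest-scoring cell of each per-keypoint heatmap and flattens the normalized coordinates into a single feature vector. It is not the MTHN module or the authors' behavior vector: the argmax decoding, the [0, 1] normalization, and the names `decode_peak` and `keypoint_vector` are all assumptions made for this example.

```python
from typing import List, Tuple

Heatmap = List[List[float]]  # rows x cols of scores for one keypoint


def decode_peak(heatmap: Heatmap) -> Tuple[int, int, float]:
    """Return (x, y, score) of the highest-scoring cell in one heatmap."""
    best_x, best_y, best_score = 0, 0, float("-inf")
    for y, row in enumerate(heatmap):
        for x, score in enumerate(row):
            if score > best_score:
                best_x, best_y, best_score = x, y, score
    return best_x, best_y, best_score


def keypoint_vector(heatmaps: List[Heatmap]) -> List[float]:
    """Flatten per-keypoint peaks into [x0, y0, x1, y1, ...] scaled to [0, 1].

    Hypothetical stand-in for a keypoint feature vector; the paper's own
    behavior vector may be defined differently.
    """
    vector: List[float] = []
    for hm in heatmaps:
        height, width = len(hm), len(hm[0])
        x, y, _ = decode_peak(hm)
        # Guard against 1-pixel heatmaps to avoid division by zero.
        vector.extend([x / max(width - 1, 1), y / max(height - 1, 1)])
    return vector


if __name__ == "__main__":
    toy = [
        [[0.0, 0.1, 0.0], [0.0, 0.9, 0.2], [0.0, 0.0, 0.0]],  # peak at x=1, y=1
        [[0.0, 0.0, 0.8], [0.0, 0.1, 0.0], [0.0, 0.0, 0.0]],  # peak at x=2, y=0
    ]
    print(keypoint_vector(toy))
```

A vector of this kind could then be passed to a small fully connected classifier, which is the role the abstract assigns to the behavior detection stage.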

Funding

This work was supported by the National Natural Science Foundation of China (Grant Numbers 62177012, 62001133, and 61967005) and the Innovation Project of GUET Graduate Education (Grant Number 2021YCXS027).

Author information

Corresponding author

Correspondence to Hua Yuan.

Ethics declarations

Conflict of Interests

The authors declare no conflict of interest.

Additional information

Availability of data and materials

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Mo, J., Zhu, R., Yuan, H. et al. Student behavior recognition based on multitask learning. Multimed Tools Appl 82, 19091–19108 (2023). https://doi.org/10.1007/s11042-022-14100-7

  • DOI: https://doi.org/10.1007/s11042-022-14100-7
