Classroom Attention Estimation Method Based on Mining Facial Landmarks of Students

  • Conference paper
MultiMedia Modeling (MMM 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13142))

Abstract

Classroom attention estimation aims to capture the multimodal semantic information present in a teaching situation and to analyze how well students concentrate and participate in class. However, mining information from different modalities in real, non-experimental teaching scenes to build a unified attention model remains a challenge. To advance this line of research, this paper proposes a new method for automatically estimating attention from facial feature points. The method uses face detection and face alignment algorithms to capture 68 landmarks on each student's face in classroom videos, and introduces face reference information to constrain the landmarks and extract feature sets. The purpose is to make the attention model less sensitive to differences between individual faces. The automatic evaluation module uses machine learning algorithms to train a classifier that estimates each student's attention level. In extensive experiments on multiple real classroom video datasets, our three-level attention classifier achieves an accuracy of 82.5%, which outperforms other studies in the field of student participation analysis. The results show that the method based on facial landmark mining can predict an individual student's classroom attention level more accurately, and can serve as a non-intrusive automatic method for analyzing real classroom multimedia data.
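The abstract does not say which face reference information is used to constrain the landmarks, so the following Python sketch only illustrates the general idea under stated assumptions. It assumes the common 68-point layout with 0-based indices (nose tip at 30, outer eye corners at 36 and 45), takes the nose tip as the origin, and uses the distance between the outer eye corners as the scale. The names `normalize_landmarks` and `feature_vector` are hypothetical and are not taken from the paper.

```python
import math

# Assumed indices in the common 68-point landmark layout (0-based).
NOSE_TIP = 30
LEFT_EYE_OUTER = 36
RIGHT_EYE_OUTER = 45


def normalize_landmarks(points):
    """Center landmarks on the nose tip and scale by the outer-eye distance.

    `points` is a list of 68 (x, y) pairs in pixel coordinates. The result
    does not change when the face is shifted or uniformly resized in the
    image, which is one way to reduce sensitivity to face size and position.
    """
    if len(points) != 68:
        raise ValueError("expected 68 (x, y) landmarks")
    ox, oy = points[NOSE_TIP]
    lx, ly = points[LEFT_EYE_OUTER]
    rx, ry = points[RIGHT_EYE_OUTER]
    scale = math.hypot(rx - lx, ry - ly)
    if scale == 0:
        raise ValueError("degenerate face: outer eye corners coincide")
    return [((x - ox) / scale, (y - oy) / scale) for x, y in points]


def feature_vector(points):
    """Flatten the normalized landmarks into a 136-dimensional vector."""
    return [coord for point in normalize_landmarks(points) for coord in point]
```

A vector of this kind could then be passed to any standard classifier to predict one of the three attention levels. The abstract does not name the classifier or the exact feature set, so that stage is left out here.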

L. Chen and H. Yang—These authors contributed equally to this work.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (No. 61772023), National Key Research and Development Program of China (No. 2019QY1803), Fujian Science and Technology Plan Industry-University-Research Cooperation Project (No.2021H6015), the National College Student Innovation and Entrepreneurship Training Program of China (202110384258) and The Social Science Program of Fujian Province (FJ2020B062).

Author information

Corresponding author

Correspondence to Kunhong Liu.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Chen, L., Yang, H., Liu, K. (2022). Classroom Attention Estimation Method Based on Mining Facial Landmarks of Students. In: Þór Jónsson, B., et al. MultiMedia Modeling. MMM 2022. Lecture Notes in Computer Science, vol 13142. Springer, Cham. https://doi.org/10.1007/978-3-030-98355-0_22

  • DOI: https://doi.org/10.1007/978-3-030-98355-0_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-98354-3

  • Online ISBN: 978-3-030-98355-0

  • eBook Packages: Computer Science, Computer Science (R0)
