Abstract
To address the low detection efficiency and long processing time of traditional video surveillance methods for detecting and identifying abnormal behavior, a multimodal abnormal behavior detection and identification method based on video surveillance is proposed and applied to the task of evaluating college students' concentration in online English video classes. The model captures abnormal behaviors and facial expressions and builds a joint network that fuses the two. Tests on two open-source datasets and self-built real-time classroom datasets show that the model achieves better recognition performance than current mainstream models while maintaining real-time performance. The proposed model offers a new approach to building smart classrooms.









Change history
27 August 2024
A Correction to this paper has been published: https://doi.org/10.1007/s12652-024-04845-4
About this article
Cite this article
Miao, Q., Li, L. & Wu, D. An English video teaching classroom attention evaluation model incorporating multimodal information. J Ambient Intell Human Comput 15, 3067–3079 (2024). https://doi.org/10.1007/s12652-024-04800-3
DOI: https://doi.org/10.1007/s12652-024-04800-3