
Integrating Gaze and Mouse Via Joint Cross-Attention Fusion Net for Students' Activity Recognition in E-learning

Published: 27 September 2023

Abstract

E-learning has emerged as an indispensable educational mode in the post-pandemic era. However, this mode makes it difficult for students to stay engaged without appropriate activity monitoring. Our work explores a promising solution that combines gaze and mouse data to recognize students' activities, thereby facilitating activity monitoring and analysis during e-learning. We first surveyed 200 students from a local university and found that eye trackers and mouse loggers were more widely accepted than video surveillance. We then designed eight routine digital activities to collect a multimodal dataset and analyzed the patterns of, and correlations between, gaze and mouse behavior across these activities. Our proposed Joint Cross-Attention Fusion Net, a multimodal activity recognition framework, exploits the gaze-mouse relationship to improve classification performance: it integrates cross-modal representations through a cross-attention mechanism and fuses joint features that characterize gaze-mouse coordination. Evaluation results show that our method achieves up to a 94.87% F1-score on the eight-class activity prediction task, an improvement of at least 7.44% over using gaze or mouse data alone. This research opens new possibilities for monitoring student engagement in intelligent education systems and suggests a promising strategy for melding perception and action modalities in behavioral analysis across a range of ubiquitous computing environments.
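To make the fusion idea in the abstract concrete, here is a minimal PyTorch sketch of bidirectional cross-attention between a gaze feature sequence and a mouse feature sequence, pooled and classified over eight activity classes. This is not the authors' implementation: the class name, feature dimensions, mean-pooling, and concatenation steps are all illustrative assumptions standing in for the paper's full architecture.

    # Minimal sketch of cross-attention fusion between two behavioral
    # modalities (gaze and mouse). Illustrative only; dimensions and
    # module layout are assumptions, not the paper's released code.
    import torch
    import torch.nn as nn

    class CrossAttentionFusion(nn.Module):
        """Fuses gaze and mouse feature sequences with bidirectional
        cross-attention, then classifies the pooled joint representation."""

        def __init__(self, dim: int = 64, num_heads: int = 4, num_classes: int = 8):
            super().__init__()
            # Each modality attends to the other: queries come from one
            # stream, keys/values from the opposite stream.
            self.gaze_to_mouse = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.mouse_to_gaze = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.classifier = nn.Linear(2 * dim, num_classes)

        def forward(self, gaze: torch.Tensor, mouse: torch.Tensor) -> torch.Tensor:
            # gaze: (batch, T_gaze, dim); mouse: (batch, T_mouse, dim)
            gaze_attended, _ = self.gaze_to_mouse(gaze, mouse, mouse)
            mouse_attended, _ = self.mouse_to_gaze(mouse, gaze, gaze)
            # Pool each attended stream over time, then concatenate the
            # two cross-modal representations into one joint feature.
            joint = torch.cat([gaze_attended.mean(dim=1),
                               mouse_attended.mean(dim=1)], dim=-1)
            return self.classifier(joint)

    # Usage: batches of per-window gaze and mouse embeddings of
    # (possibly different) sequence lengths.
    model = CrossAttentionFusion()
    logits = model(torch.randn(2, 50, 64), torch.randn(2, 120, 64))
    print(logits.shape)  # torch.Size([2, 8]) -- one score per activity class

The paper's network additionally incorporates joint features characterizing gaze-mouse coordination; in this sketch, the simple concatenation of the two attended streams stands in for that fusion step.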


Cited By

  • (2024) Modeling Attentive Interaction Behavior for Web Content Identification in Exploratory Information Seeking. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8, 4 (2024), 1--28. https://doi.org/10.1145/3699750. Online publication date: 21 November 2024.
  • (2024) Gaze-Driven Adaptive Learning System With ChatGPT-Generated Summaries. IEEE Access 12 (2024), 173714--173733. https://doi.org/10.1109/ACCESS.2024.3503059.


    Published In

    Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 7, Issue 3
    September 2023
    1734 pages
    EISSN:2474-9567
    DOI:10.1145/3626192

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 27 September 2023
    Published in IMWUT Volume 7, Issue 3


    Author Tags

    1. activity recognition
    2. e-learning
    3. gaze
    4. mouse movement
    5. multimodal fusion

    Qualifiers

    • Research-article
    • Research
    • Refereed
