
Integrating Gaze and Mouse Via Joint Cross-Attention Fusion Net for Students' Activity Recognition in E-learning

Published: 27 September 2023

Abstract

E-learning has emerged as an indispensable educational mode in the post-pandemic era. However, this mode makes it difficult for students to stay engaged in learning without appropriate activity monitoring. Our work explores a promising solution that combines gaze and mouse data to recognize students' activities, thereby facilitating activity monitoring and analysis during e-learning. We first surveyed 200 students from a local university and found greater acceptance of eye trackers and mouse loggers than of video surveillance. We then designed eight routine digital activities for students, collected a multimodal dataset, and analyzed the patterns of and correlations between gaze and mouse behavior across the activities. Our proposed Joint Cross-Attention Fusion Net, a multimodal activity recognition framework, leverages the gaze-mouse relationship to improve classification performance by fusing cross-modal representations through a cross-attention mechanism and incorporating joint features that characterize gaze-mouse coordination. Evaluation results show that our method achieves up to a 94.87% F1-score on the eight-class activity recognition task, an improvement of at least 7.44% over using gaze or mouse data alone. This research opens new possibilities for monitoring student engagement in intelligent education systems and suggests a promising strategy for combining perception and action modalities in behavioral analysis across a range of ubiquitous computing environments.
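To make the fusion idea in the abstract concrete, the following is a minimal PyTorch-style sketch of cross-attention fusion between gaze and mouse feature sequences for an eight-class activity classifier. It is not the authors' implementation: the GRU encoders, feature dimensions, mean pooling, and all module names are illustrative assumptions.

```python
# Minimal illustrative sketch (NOT the paper's implementation) of cross-attention
# fusion between gaze and mouse feature sequences for 8-class activity recognition.
import torch
import torch.nn as nn


class CrossAttentionFusionNet(nn.Module):
    def __init__(self, gaze_feat=4, mouse_feat=4, dim=64, heads=4, num_classes=8):
        super().__init__()
        # Per-modality sequence encoders (assumption: simple GRUs).
        self.gaze_enc = nn.GRU(gaze_feat, dim, batch_first=True)
        self.mouse_enc = nn.GRU(mouse_feat, dim, batch_first=True)
        # Cross-attention: each modality queries the other one.
        self.gaze2mouse = nn.MultiheadAttention(embed_dim=dim, num_heads=heads, batch_first=True)
        self.mouse2gaze = nn.MultiheadAttention(embed_dim=dim, num_heads=heads, batch_first=True)
        # Joint classifier over the concatenated, temporally pooled representations.
        self.head = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, num_classes))

    def forward(self, gaze, mouse):
        g, _ = self.gaze_enc(gaze)    # (batch, T_gaze, dim)
        m, _ = self.mouse_enc(mouse)  # (batch, T_mouse, dim)
        # Gaze features attend to mouse features, and vice versa, to capture coordination.
        g_att, _ = self.gaze2mouse(query=g, key=m, value=m)
        m_att, _ = self.mouse2gaze(query=m, key=g, value=g)
        joint = torch.cat([g_att.mean(dim=1), m_att.mean(dim=1)], dim=-1)
        return self.head(joint)       # logits over the 8 activity classes


# Toy usage: a batch of 2 windows with 50 gaze samples and 30 mouse events each.
model = CrossAttentionFusionNet()
logits = model(torch.randn(2, 50, 4), torch.randn(2, 30, 4))
print(logits.shape)  # torch.Size([2, 8])
```

In this sketch each modality queries the other, so the fused representation encodes how gaze and mouse movements co-occur within a window, which reflects the gaze-mouse coordination the abstract describes the framework as exploiting.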



• Published in

  Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 7, Issue 3
  September 2023
  1734 pages
  EISSN: 2474-9567
  DOI: 10.1145/3626192

      Copyright © 2023 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed