ABSTRACT
Engagement is the holy grail of learning, whether in a classroom setting or on an online learning platform. Studies have shown that knowing a student's engagement level during learning can benefit both the student and the teacher. Tracking the engagement of each student is difficult in face-to-face learning in a large classroom, and even more so on an online learning platform, where users access the material at different times. Automatic analysis of student engagement can help to better understand the state of the student in both classroom settings and online learning platforms, and it scales better. In this paper we propose a framework that uses a Temporal Convolutional Network (TCN) to estimate the engagement intensity of students watching video material from Massive Open Online Courses (MOOCs). The input to the TCN is statistical features computed over 10-second segments of the video from the gaze, head pose and action unit intensities provided by the OpenFace library. The ability of the TCN architecture to capture long-term dependencies allows it to outperform other sequential models such as LSTMs. On the test set of the EmotiW 2018 sub-challenge "Engagement in the Wild", the proposed approach with a Dilated-TCN achieved an average mean squared error of 0.079.
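The feature pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes per-frame features (gaze, head pose, action unit intensities) have already been extracted, e.g. with OpenFace, and computes per-segment mean and standard deviation over fixed 10-second windows; the frame rate and feature count are hypothetical stand-ins.

```python
import numpy as np

def segment_statistics(frames, fps=30, segment_seconds=10):
    """Split per-frame features into fixed-length segments and compute
    per-segment statistics (mean and standard deviation), mirroring the
    10-second windowing described in the abstract.

    frames: array of shape (n_frames, n_features).
    Returns: array of shape (n_segments, 2 * n_features).
    """
    seg_len = fps * segment_seconds          # frames per 10-second segment
    n_segments = len(frames) // seg_len      # drop any incomplete tail segment
    stats = []
    for i in range(n_segments):
        seg = frames[i * seg_len:(i + 1) * seg_len]
        # Concatenate mean and std over the segment into one feature vector.
        stats.append(np.concatenate([seg.mean(axis=0), seg.std(axis=0)]))
    return np.stack(stats)

# Synthetic stand-in for OpenFace per-frame output: 60 s of video at 30 fps,
# with an assumed 22 features (e.g. gaze angles + head pose + AU intensities).
frames = np.random.rand(1800, 22)
X = segment_statistics(frames)
print(X.shape)  # (6, 44) -- six 10-second segments, mean+std per feature
```

The resulting segment-level sequence would then be fed to the TCN, whose dilated convolutions give it a receptive field spanning many segments.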
REFERENCES
- Rick D. Axelson and Arend Flick. 2010. Defining student engagement. Change: The Magazine of Higher Learning 43, 1 (2010), 38--43.
- Tadas Baltrušaitis, Marwa Mahmoud, and Peter Robinson. 2015. Cross-dataset learning and person-specific normalisation for automatic action unit detection. In 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), Vol. 6. IEEE, 1--6.
- Tadas Baltrušaitis, Peter Robinson, and Louis-Philippe Morency. 2013. Constrained local neural fields for robust facial landmark detection in the wild. In Proceedings of the IEEE International Conference on Computer Vision Workshops. 354--361.
- Tadas Baltrušaitis, Peter Robinson, and Louis-Philippe Morency. 2016. OpenFace: an open source facial behavior analysis toolkit. In 2016 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 1--10.
- Jonathan Bidwell and Henry Fuchs. 2011. Classroom analytics: Measuring student engagement with automated gaze tracking. Behavior Research Methods 49 (2011), 113.
- Mathieu Chollet and Stefan Scherer. 2017. Assessing public speaking ability from thin slices of behavior. In 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017). IEEE, 310--316.
- Charles Darwin and Phillip Prodger. 1998. The Expression of the Emotions in Man and Animals. Oxford University Press, USA.
- Arjun D'Cunha, Abhay Gupta, Kamal Awasthi, and Vineeth Balasubramanian. 2016. DAiSEE: Towards user engagement recognition in the wild. arXiv preprint arXiv:1609.01885 (2016).
- Abhinav Dhall, Amanjot Kaur, Roland Goecke, and Tom Gedeon. 2018. EmotiW 2018: Audio-video, student engagement and group-level affect prediction. In ACM ICMI (in press) (2018).
- Maria Frank, Ghassem Tofighi, Haisong Gu, and Renate Fruchter. 2016. Engagement detection in meetings. arXiv preprint arXiv:1608.08711 (2016).
- Daniel Gatica-Perez, Iain McCowan, Dong Zhang, and Samy Bengio. 2005. Detecting group interest-level in meetings. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'05), Vol. 1. IEEE, I--489.
- Amanjot Kaur, Aamir Mustafa, Love Mehta, and Abhinav Dhall. 2018. Prediction and localization of student engagement in the wild. arXiv preprint arXiv:1804.00858 (2018).
- Reed W. Larson and Maryse H. Richards. 1991. Boredom in the middle school years: Blaming schools versus blaming students. American Journal of Education 99, 4 (1991), 418--443.
- Daniel F. O. Onah, Jane Sinclair, and Russell Boyatt. 2014. Dropout rates of massive open online courses: behavioural patterns. In EDULEARN14 Proceedings (2014), 5825--5834.
- Jinxian Qin, Yaqian Zhou, Hong Lu, and Heqing Ya. 2015. Teaching video analytics based on student spatial and temporal behavior mining. In Proceedings of the 5th ACM International Conference on Multimedia Retrieval. ACM, 635--642.
- Mirko Raca. 2015. Camera-based estimation of student's attention in class. (2015).
- Prajit Ramachandran, Barret Zoph, and Quoc V. Le. 2017. Searching for activation functions. arXiv preprint arXiv:1710.05941 (2017).
- Colin Lea, Michael D. Flynn, René Vidal, Austin Reiter, and Gregory D. Hager. 2017. Temporal convolutional networks for action segmentation and detection. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- Hanan Salam and Mohamed Chetouani. 2015. Engagement detection based on multi-party cues for human robot interaction. In 2015 International Conference on Affective Computing and Intelligent Interaction (ACII). IEEE, 341--347.
- Gale M. Sinatra, Benjamin C. Heddy, and Doug Lombardi. 2015. The challenges of defining and measuring student engagement in science. (2015).
- Chinchu Thomas and Dinesh Babu Jayagopi. 2017. Predicting student engagement in classrooms using facial behavioral cues. In Proceedings of the 1st ACM SIGCHI International Workshop on Multimodal Interaction for Education. ACM, 33--40.
- Ghassem Tofighi, Haisong Gu, and Kaamraan Raahemifar. 2016. Vision-based engagement detection in Virtual Reality. In Digital Media Industry & Academic Forum (DMIAF). IEEE, 202--206.
- Jacob Whitehill, Zewelanji Serpell, Yi-Ching Lin, Aysha Foster, and Javier R. Movellan. 2014. The faces of engagement: Automatic recognition of student engagement from facial expressions. IEEE Transactions on Affective Computing 5, 1 (2014), 86--98.
- Erroll Wood et al. 2015. Rendering of eyes for eye-shape registration and gaze estimation. In Proceedings of the IEEE International Conference on Computer Vision. 3756--3764.
Predicting Engagement Intensity in the Wild Using Temporal Convolutional Network