ABSTRACT
Learning management systems (LMS) provide a wealth of data waiting to be analyzed and used to enhance the student and faculty experience in higher education. As universities struggle to support students' engagement, success, and retention, learning analytics is being used to build predictive models and dashboards that support learners and help them stay engaged, help teachers identify students who need support, and predict and prevent dropout. Learning with Big Data has its challenges, however: managing large quantities of data requires time and expertise. To predict at-risk students, many institutions apply machine learning algorithms to LMS data for a given course or type of course, but only a few attempt predictions across a large subset of courses. This raises the question: "How can student dropout be predicted across a very large set of courses in an institution's Moodle LMS?" In this paper, we use automation to improve student dropout prediction for a very large subset of courses, first by clustering the courses based on course design similarity, then by automatically training, testing, and selecting machine learning algorithms for each cluster. The resulting methodology outlines a basic framework that can be adjusted and optimized in many ways, and that further studies can build on and improve.
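The two-stage approach described above can be sketched in code. This is a minimal illustration, not the paper's actual pipeline: the features, candidate models, and data are all invented placeholders. It shows the general shape of the method: (1) cluster courses by design features, then (2) within each cluster, cross-validate several candidate classifiers on student activity data and keep the best one.

```python
# Illustrative sketch (not the paper's implementation):
# stage 1 clusters courses by design features; stage 2 selects,
# per cluster, the best of several dropout classifiers.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic "course design" features (e.g. counts of forums, quizzes, files).
n_courses = 60
course_features = rng.random((n_courses, 3))

# Stage 1: group courses by design similarity.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
course_cluster = kmeans.fit_predict(course_features)

# Synthetic per-student activity features (e.g. logins, clicks, submissions)
# and a synthetic dropout label (1 = dropped out).
n_students = 600
student_course = rng.integers(0, n_courses, n_students)
X = rng.random((n_students, 5))
y = (X[:, 0] + rng.normal(0, 0.3, n_students) < 0.5).astype(int)
student_cluster = course_cluster[student_course]

# Stage 2: per cluster, train several candidate models and keep the one
# with the best mean cross-validated accuracy.
candidates = {
    "logreg": LogisticRegression(max_iter=1000),
    "forest": RandomForestClassifier(n_estimators=50, random_state=0),
}
best_per_cluster = {}
for c in np.unique(student_cluster):
    mask = student_cluster == c
    scores = {name: cross_val_score(model, X[mask], y[mask], cv=5).mean()
              for name, model in candidates.items()}
    best_per_cluster[int(c)] = max(scores, key=scores.get)

print(best_per_cluster)
```

In a real deployment, the course features would come from Moodle course-structure data, the student features from LMS activity logs, and the candidate pool and selection metric would be tuned per institution.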
Cluster-Based Performance of Student Dropout Prediction as a Solution for Large Scale Models in a Moodle LMS