Abstract
Massive open online courses (MOOCs) give learners around the world access to high-quality educational resources, but persistently high dropout rates seriously undermine their educational effectiveness. Predicting dropout in MOOCs and intervening in advance has therefore become an active research topic in recent years. Traditional methods rely on handcrafted features, which are labor-intensive to build and make it difficult to guarantee the final prediction quality. To address this problem, this paper proposes an end-to-end dropout prediction model based on convolutional neural networks that integrates feature extraction and classification into a single framework. The model transforms the original timestamp data according to different time windows and automatically extracts features to achieve a better feature representation. Extensive experiments on a public dataset show that our approach achieves results comparable to other dropout prediction methods in terms of precision, recall, F1 score, and AUC score.
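As an illustration of what transforming timestamp data according to different time windows can look like, the following is a minimal sketch and not the paper's implementation. The function name `window_counts`, the sample timestamps, and the choice of daily and hourly windows are all assumptions made for this example.

```python
from datetime import datetime, timedelta

def window_counts(timestamps, start, window, n_windows):
    """Count events per fixed-width time window, beginning at `start`.

    Hypothetical helper for illustration; not taken from the paper.
    """
    counts = [0] * n_windows
    for ts in timestamps:
        # timedelta // timedelta yields an integer window index
        idx = (ts - start) // window
        if 0 <= idx < n_windows:
            counts[idx] += 1
    return counts

# Made-up activity log for a single student (assumption, not real data)
start = datetime(2015, 1, 1)
events = [
    datetime(2015, 1, 1, 9, 0),
    datetime(2015, 1, 1, 9, 30),
    datetime(2015, 1, 1, 23, 0),
    datetime(2015, 1, 3, 10, 0),
]

# The same raw timestamps viewed at two granularities
daily = window_counts(events, start, timedelta(days=1), 3)
hourly = window_counts(events, start, timedelta(hours=1), 24)
print(daily)
print(hourly[9], hourly[23])
```

Count vectors like these, computed at several window sizes, could then be stacked into a fixed-size array that a convolutional network takes as input; how the paper actually arranges its input is not stated in the abstract.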
Acknowledgements
This work is supported by the National Social Science Fund of China for Young Project (13CYY037) and Educational Informatization Research Center of Hubei, Central China Normal University. We would like to gratefully acknowledge the organizers of KDD Cup 2015 as well as XuetangX for making the datasets available.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Qiu, L., Liu, Y., Hu, Q. et al. Student dropout prediction in massive open online courses by convolutional neural networks. Soft Comput 23, 10287–10301 (2019). https://doi.org/10.1007/s00500-018-3581-3
DOI: https://doi.org/10.1007/s00500-018-3581-3